This invention relates to the use of CG7956, aralar1, how (held out wings),
CG9373, cpo (couch potato), Jafrac1 (thioredoxin peroxidase 1), or
CG14440 homologous proteins, to the use of polynucleotides encoding
these, and to the use of effectors/modulators of the proteins and 
polynucleotides in the diagnosis, study, prevention, and treatment of 
obesity and/or diabetes and/or metabolic syndrome. 

There are several metabolic diseases of human and animal metabolism, e.g.,
obesity and severe weight loss, that relate to an energy imbalance in which
caloric intake and energy expenditure are not balanced. Obesity is one of the
most prevalent metabolic disorders in the world. It is still a poorly
understood human disease that is becoming an increasingly relevant major
health problem for western society. Obesity is defined as a body weight
more than 20% in excess of the ideal body weight, frequently resulting in 
a significant impairment of health. Obesity may be measured by body mass 
index, an indicator of adiposity or fatness. Further parameters for defining 
obesity are waist circumferences, skinfold thickness and bioimpedance 
(see, inter alia, Kopelman (1999), loc. cit.). Obesity is associated with an 
increased risk for cardiovascular disease, hypertension, diabetes, 
hyperlipidaemia and an increased mortality rate. Besides severe risks of 
illness, individuals suffering from obesity are often isolated socially. 
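
As an illustration of the two measures mentioned above, the following Python sketch computes the body mass index (weight in kilograms divided by the square of height in metres) and tests the 20% excess-weight criterion. The function names are hypothetical, and the ideal body weight is taken as an input because the text does not specify how it is determined.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return the body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


def percent_over_ideal(weight_kg: float, ideal_weight_kg: float) -> float:
    """Return how far the body weight lies above the ideal weight, in percent."""
    if ideal_weight_kg <= 0:
        raise ValueError("ideal weight must be positive")
    return 100.0 * (weight_kg - ideal_weight_kg) / ideal_weight_kg


def exceeds_20_percent(weight_kg: float, ideal_weight_kg: float) -> bool:
    """Apply the 'more than 20% in excess of the ideal body weight' criterion."""
    return percent_over_ideal(weight_kg, ideal_weight_kg) > 20.0
```

The ideal weight is deliberately left to the caller, so that any reference table or formula may be supplied without changing the criterion itself.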

Obesity is influenced by genetic, metabolic, biochemical, psychological, 
and behavioral factors, and can be caused by different reasons such as 
non-insulin dependent diabetes, increase in triglycerides, increase in 
carbohydrate bound energy and low energy expenditure. As such, it is a 
complex disorder that must be addressed on several fronts to achieve a
lasting positive clinical outcome. Obesity is not to be considered a
single disorder but a heterogeneous group of conditions with
(potentially) multiple causes; it is also characterized by elevated fasting
plasma insulin and an exaggerated insulin response to oral glucose intake 
(Koltermann J., (1980) Clin. Invest 65, 1272-1284). A clear involvement 
of obesity in type 2 diabetes mellitus can be confirmed (Kopelman P.G., 
(2000) Nature 404, 635-643). 

Hyperlipidemia and elevation of free fatty acids correlate clearly with the 
metabolic syndrome, which is defined as the linkage between several 
diseases, including obesity and insulin resistance. These often occur in the
same patients and are major risk factors for development of type 2
diabetes and cardiovascular disease. It was suggested that the control of
lipid levels and glucose levels is required to treat type 2 diabetes, heart
disease, and other occurrences of metabolic syndrome (see, for example,
Santomauro A. T. et al., (1999) Diabetes, 48(9):1836-1841 and McCook,
2002, JAMA 288:2709-2716). 

The molecular factors regulating food intake and body weight balance are 
incompletely understood. Although several candidate genes have been
described that are thought to influence the homeostatic system(s) that
regulate body mass/weight, such as leptin or the peroxisome
proliferator-activated receptor-gamma co-activator, the distinct molecular
mechanisms and/or molecules influencing obesity or body weight/body
mass regulation are not known. In addition, several single-gene mutations
resulting in obesity have been described in mice, implicating genetic factors 
in the etiology of obesity (Friedman and Leibel, 1990, Cell 69: 217-220). 
In the ob mouse a single gene mutation (obese) results in profound obesity, 
which is accompanied by diabetes (Friedman et al., 1991, Genomics 11:
1054-1062).

Therefore, the technical problem underlying the present invention was to
provide for means and methods for modulating (pathological) metabolic 
conditions influencing body-weight regulation and/or energy homeostatic 
circuits. The solution to said technical problem is achieved by providing the
embodiments characterized in the claims. Accordingly, the present
invention relates to novel functions of proteins and nucleic acids encoding
these in body-weight regulation, energy homeostasis, metabolism, and
obesity. The proteins disclosed herein and polynucleotides encoding these
are thus suitable to investigate metabolic diseases and disorders. Further
new compositions are provided that are useful in diagnosis, treatment, and
prognosis of metabolic diseases and disorders as described.

KIAA0966 encodes a Synaptojanin-like protein, the Sac
domain-containing inositol phosphatase (hSac2). Synaptic vesicles are
recycled with remarkable speed and precision in nerve terminals. A major
recycling pathway involves clathrin-mediated endocytosis at endocytic
zones located around sites of release. Different 'accessory' proteins linked
to this pathway have been shown to alter the shape and composition of
lipid membranes, to modify membrane-coat protein interactions, and to
influence actin polymerization. These include the GTPase dynamin, the
lysophosphatidic acid acyl transferase endophilin, and the phosphoinositide
phosphatase synaptojanin (Brodin L. et al., 2000, Curr Opin Neurobiol
10(3):312-320). Studies on the endocytosis of synaptic vesicles have
shown the essential roles of endophilin and synaptojanin in vesicle
formation (see Ringstad N. et al., 1999, Neuron 24(1):143-154). The
recessive suppressor of secretory defect in yeast Golgi and yeast actin
function belongs to this family (Luo W. and Chang A., 1997, J Cell Biol
138(4):731-746). This protein may be involved in the coordination of the
activities of the secretory pathway and the actin cytoskeleton. Human
synaptojanin, which may be localised on coated endocytic intermediates in
nerve terminals, also belongs to this family (Haffner C. et al., 1997, FEBS
Lett 419(2-3):175-180).

The human Sac domain-containing inositol phosphatase (hSac2) is 
ubiquitously expressed, but especially abundant in the brain, heart, skeletal 
muscle, and kidney. hSac2 protein exhibits 5-phosphatase activity specific 
for phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 
3,4,5-trisphosphate (Minagawa T. et al., (2001) J Biol Chem
276(25):22011-22015).

Energy transduction in mitochondria requires the transport of many specific 
metabolites across the inner membrane of this eukaryotic organelle. The 
mitochondrial carrier family (MCF) consists of at least thirty-seven proteins
(Kuan J. and Saier M.H., 1993, Crit Rev Biochem Mol Biol 28(3):209-233).
The mitochondrial aspartate/glutamate carrier catalyzes an important step
in both the urea cycle and the aspartate/malate NADH shuttle. Citrin and
aralar1 are homologous proteins belonging to the mitochondrial carrier
family with EF-hand Ca2+ binding motifs in their N-terminal domains. Citrin
and aralar1 are Ca2+-stimulated isoforms of the aspartate/glutamate
transporter in mitochondria (Palmieri L. et al., 2001, EMBO J
20(18):5060-9). Solute carrier family 25, member 13 (SLC25A13) encodes a
calcium-binding mitochondrial carrier protein, designated citrin. Mutations
in the SLC25A13 gene lead to adult-onset type II citrullinemia (Yasuda T.
et al., 2000, Hum Genet 107(6):537-545).

The held out wings (how) Drosophila gene encodes an RNA-binding protein
involved in the control of muscular and cardiac activity. The how protein is
localized to the nucleus. how is highly related to the mouse quaking gene,
which plays a role at least in myelination and could serve to link a
signal transduction pathway to the control of mRNA metabolism (Zaffran S.
et al., 1997, Development 124(10):2087-2098). Two isoforms of the
Drosophila RNA binding protein, how, act in opposing directions to regulate
tendon cell differentiation (Nabel-Rosen H. et al., 2002, Dev Cell
2(2):183-193). The opposing activities of the How isoforms are
manifested by differential rates of mRNA degradation of the target stripe 
mRNA. This mechanism is conserved, as the mammalian RNA binding 
Quaking proteins may similarly affect the levels of Krox20, a regulator of 
Schwann cell maturation. 

The mouse quaking (qk) gene is essential in both myelination and early 
embryogenesis. Its product, QKI, is an RNA-binding protein belonging to a 
growing protein family called STAR (signal transduction and activator of 
RNA) (Wu J. et al., 1999, J Biol Chem 274(41):29202-29210). Quaking is
essential for blood vessel development (Noveroske J.K. et al., 2002,
Genesis 32(3):218-230).

The myelin basic protein (MBP) gene is expressed in oligodendrocytes and 
Schwann cells, and expression follows a tightly regulated developmental 
time course. Cell type- and developmental stage-specific expression of the 
MBP gene is regulated by a series of cis-acting elements located upstream 
of the transcription start site. Myelin gene expression factor-2 (Myef-2), a
protein isolated from mouse brain, represses transcription of the MBP gene.
Myef-2 mRNA is developmentally regulated in mouse brain; its peak
expression occurs at postnatal day 7, prior to the onset of MBP expression
(Haas S. et al., 1995, J Biol Chem 270(21):12503-12510).

MBP is a major component of the myelin sheath whose production is 
developmentally controlled during myelinogenesis. Programmed expression 
of the MBP gene is regulated at the level of transcription. The MB1 
regulatory motif plays an important role in transcription of the MBP 
promoter. The MB1 element contains a binding site for the repressor 
protein MyEF-2 (Myelin gene expression factor-2). MyEF-2 is involved in 
transcriptional regulation of the MBP gene during the course of brain 
development (Muralidharan V. et al., 1997, J Cell Biochem
66(4):524-531).

The Drosophila melanogaster gene couch potato (cpo, GadFly Accession 
Number CG18434) encodes a putative nuclear RNA binding protein. The
protein is expressed in the Drosophila embryo (embryonic central nervous 
system, embryonic peripheral nervous system, embryonic/larval midgut, 
glial cell and other tissues) (Harvie et al., 1998, Genetics 149(1): 
217-231). At least three protein isoforms (for example, Cpo 17, Cpo 61.1 
and Cpo 61.2) and 49 recorded mutant alleles have been described. 
Mutations have been isolated which affect the larval ventral ganglion and 
are recessive lethal in Drosophila. Mutant cpo flies exhibit an abnormal and 
hypoactive behavior (Bellen et al., 1992, Genetics 131: 365-375, and 
Bellen et al., 1992, Genes Dev. 6: 2125-2136). This invention describes the
RNA-binding protein gene with multiple splicing and a hypothetical protein
(XP_091097) as human homologs of the Drosophila cpo gene product. No
further information is available for the human homolog proteins from the
prior art.

Incomplete reduction of atmospheric oxygen generates potent oxidizing 
agents, including reactive oxygen species (ROS) and their toxic 
byproducts. Protection from ROS is mediated by nonenzymatic agents, 
enzymes, and low molecular weight reducing agents, such as thioredoxin. 
Under normal conditions, thioredoxin reductase reduces oxidized 
thioredoxin in the presence of NADPH. Reduced thioredoxin serves as an 
electron donor for thioredoxin peroxidase (peroxiredoxin) which 
consequently reduces H2O2 to H2O (Schallreuter K.U. and Wood J.M.,
2001, J Photochem Photobiol B 64(2-3):179-184). Members of the
peroxiredoxin family play an antioxidant protective role in various tissues 
under nonpathologic conditions and during inflammatory processes. 
Antioxidants govern intracellular reduction-oxidation (redox) status, which 
plays a critical role in NF-kappaB (nuclear factor kappa-B) transcription
factor activation. Different antioxidants are selective for redox regulation of
certain transcription factors. Peroxidases of the peroxiredoxin family reduce
hydrogen peroxide (H2O2) and alkyl hydroperoxides to water and alcohol
with the use of reducing equivalents derived from thiol-containing donor 
molecules. 

A family of highly conserved antioxidant enzymes, Peroxiredoxins (Prxs), 
has two major Prx subfamilies: one subfamily uses two conserved 
cysteines (2-Cys) and the other uses 1-Cys to scavenge reactive oxygen 
species (ROS). Four mammalian 2-Cys members (Prx I-IV) utilize
thioredoxin as the electron donor for antioxidation. Prxs are capable of 
protecting cells from ROS insult and regulating the signal transduction 
pathways that utilize c-Abl, caspases, nuclear factor-kappaB (NF-kappaB) 
and activator protein-1 (AP-1) to influence cell growth and apoptosis. Prxs 
are also essential for red blood cell (RBC) differentiation and are capable of 
inhibiting human immunodeficiency virus (HIV) infection and organ 
transplant rejection (Butterfield L.H. et al., 1999, Antioxid Redox Signal 
1(4):385-402). Distribution patterns indicate that Prxs are highly expressed
in the tissues and cells at risk for diseases related to ROS toxicity, such as 
Alzheimer's and Parkinson's diseases and atherosclerosis. This correlation 
suggests that Prxs are protective against ROS toxicity, yet overwhelmed 
by oxidative stress in some cells (Butterfield L.H. et al., 1999, Antioxid 
Redox Signal 1(4):385-402). Prxs tend to form large aggregates at high
concentrations, a feature that may interfere with their normal protective 
function or may even render them cytotoxic. Imbalance in the expression 
of subtypes can also potentially increase their susceptibility to oxidative 
stress. Therefore, Prxs may play a role in the cellular dysfunction of
ROS-related diseases ranging from atherosclerosis to cancer to 
neurodegenerative diseases. 

The Drosophila gene with GadFly Accession Number CG14440 encodes
a protein which is most homologous to the human hypothetical protein
LOC55565 (GenBank Accession Number NP_060000.1 for the protein,
NM_017530 for the cDNA). No functional data are available for these 
proteins in the prior art. 

So far, it has not been described that a protein of the invention or a 
homologous protein is involved in the regulation of energy homeostasis and 
body-weight regulation and related disorders, and thus, no functions in 
metabolic diseases and other diseases as listed above have been 
discussed. In this invention we demonstrate that the correct gene dose of 
a protein of the invention is essential for maintenance of energy 
homeostasis. A genetic screen was used to identify that mutation of a 
gene encoding a protein of the invention or a homologous gene causes 
changes in the metabolism, in particular related to obesity, which is 
reflected by a significant change of triglyceride content, the major energy 
storage substance. 

Before the present proteins, nucleotide sequences, and methods are 
described, it is understood that this invention is not limited to the particular 
methodology, protocols, cell lines, vectors, and reagents described as 
these may vary. It is also to be understood that the terminology used 
herein is for the purpose of describing particular embodiments only, and is 
not intended to limit the scope of the present invention that will be limited 
only by the appended claims. Unless defined otherwise, all technical and 
scientific terms used herein have the same meanings as commonly 
understood by one of ordinary skill in the art to which this invention 
belongs. Although any methods and materials similar or equivalent to those 
described herein can be used in the practice or testing of the present 
invention, the preferred methods, devices, and materials are now 
described. All publications mentioned herein are incorporated herein by 
reference for the purpose of describing and disclosing the cell lines, 
vectors, and methodologies that are reported in the publications which 
might be used in connection with the invention. Nothing herein is to be 
construed as an admission that the invention is not entitled to antedate
such disclosure. 

The present invention discloses that CG7956, aralar1, how, CG9373, cpo,
Jafrac1, or CG14440 homologous proteins (herein referred to as "proteins
of the invention" or "a protein of the invention") regulate energy
homeostasis and fat metabolism, especially the metabolism and storage of
triglycerides, and discloses polynucleotides that identify and encode the
proteins disclosed in this invention. The invention also relates to vectors,
host cells, antibodies, and recombinant methods for producing the
polypeptides and polynucleotides of the invention. The invention also
relates to the use of these sequences in the diagnosis, study, prevention,
and treatment of
metabolic diseases and dysfunctions, including metabolic syndrome, 
obesity, or diabetes as well as related disorders such as eating disorder, 
cachexia, hypertension, coronary heart disease, hypercholesterolemia, 
dyslipidemia, osteoarthritis, or gallstones. 

GadFly Accession Number CG7956, aralar1 (GadFly Accession Number
CG2139), how (GadFly Accession Number CG10293), GadFly Accession
Number CG9373, cpo (GadFly Accession Number CG31243 and
CG18434), Jafrac1 (GadFly Accession Number CG1633), or GadFly
Accession Number CG14440 homologous proteins and nucleic acid
molecules coding therefor are obtainable from insect or vertebrate
species, e.g. mammals or birds. Particularly preferred are homologous
nucleic acids, particularly nucleic acids encoding a human protein as
described in Table 1.

The invention particularly relates to a nucleic acid molecule encoding a 
polypeptide contributing to regulating the energy homeostasis and the 
metabolism of triglycerides, wherein said nucleic acid molecule comprises 
(a) the nucleotide sequence of CG7956, aralar1, how, CG9373, cpo,
Jafrac1, or CG14440 or homologous nucleic acids, particularly
nucleic acids encoding a human protein as described in Table 1,
and/or a sequence complementary thereto,

(b) a nucleotide sequence which hybridizes at 50°C in a solution
containing 1 x SSC and 0.1% SDS to a sequence of (a),

(c) a sequence corresponding to the sequences of (a) or (b) within the 
degeneration of the genetic code, 

(d) a sequence which encodes a polypeptide which is at least 85%,
preferably at least 90%, more preferably at least 95%, more
preferably at least 98% and up to 99.6% identical, to the amino acid
sequences of CG7956, aralar1, how, CG9373, cpo, Jafrac1, or
CG14440 homologous protein, preferably of a human homologous
protein as described in Table 1,

(e) a sequence which differs from the nucleic acid molecule of (a) to (d) 
by mutation and wherein said mutation causes an alteration, 
deletion, duplication and/or premature stop in the encoded 
polypeptide or 

(f) a partial sequence of any of the nucleotide sequences of (a) to (e) 
having a length of 15 bases, preferably 20 bases, more preferably
25 bases and most preferably at least 50 bases. 
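
The identity thresholds of item (d) can be illustrated with a small Python sketch. It assumes that the two amino acid sequences have already been aligned to equal length with "-" as the gap symbol, and it counts identity over all alignment columns; both the convention and the function names are assumptions made for illustration, since the text does not state how identity is to be computed.

```python
# Thresholds named in item (d), highest first.
IDENTITY_TIERS = (98.0, 95.0, 90.0, 85.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A column counts as a match only if both residues agree and are not gaps.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def highest_tier_met(identity_percent: float):
    """Return the highest threshold of item (d) that is reached, or None."""
    for tier in IDENTITY_TIERS:
        if identity_percent >= tier:
            return tier
    return None
```

For example, two aligned ten-residue sequences that differ at one position give 90% identity, which reaches the 90% tier but not the 95% tier.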

The invention is based on the finding that CG7956, aralar1, how, CG9373,
cpo, Jafrac1, or CG14440 and/or homologous proteins and the
polynucleotides encoding these, are involved in the regulation of 
triglyceride storage and therefore energy homeostasis. The invention 
describes the use of these compositions for the diagnosis, study, 
prevention, or treatment of metabolic diseases or dysfunctions, including 
metabolic syndrome, obesity, or diabetes, as well as related disorders such 
as eating disorder, cachexia, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, or gallstones. 

Accordingly, the present invention relates to genes with novel functions in 
body-weight regulation, energy homeostasis, metabolism, and obesity, 
functional fragments of said genes, polypeptides encoded by said genes or
fragments thereof, and effectors/modulators thereof, e.g. antibodies, 
biologically active nucleic acids, such as antisense molecules, RNAi 
molecules or ribozymes, aptamers, peptides or low-molecular weight 
organic compounds recognizing said polynucleotides or polypeptides. 

The ability to manipulate and screen the genomes of model organisms such 
as the fly Drosophila melanogaster provides a powerful tool to analyze 
biological and biochemical processes that have direct relevance to more 
complex vertebrate organisms due to significant evolutionary conservation 
of genes, cellular processes, and pathways (see, for example, Adams M. 
D. et al., (2000) Science 287: 2185-2195). Identification of novel gene 
functions in model organisms can directly contribute to the elucidation of 
correlative pathways in mammals (humans) and of methods of modulating 
them. A correlation between a pathology model (such as changes in 
triglyceride levels as indication for metabolic syndrome including obesity) 
and the modified expression of a fly gene can identify the association of 
the human ortholog with the particular human disease. 

In one embodiment, a forward genetic screen is performed in flies displaying
a mutant phenotype due to misexpression of a known gene (see Johnston,
(2002) Nat Rev Genet 3: 176-188; Rorth P., (1996) Proc Natl Acad Sci U
S A 93: 12418-12422). In this invention, we have used a genetic screen
to identify mutations of the CG7956, aralar1, how, CG9373, cpo, Jafrac1,
or CG14440 gene, or homologous genes, that cause changes in the body
weight, which are reflected by a significant change of triglyceride levels.

Obese people mainly show a significant increase in the content of 
triglycerides. Triglycerides are the most efficient storage for energy in cells. 
In order to isolate genes with a function in energy homeostasis, several 
thousand proprietary and publicly available EP-lines were tested for their 
triglyceride content after a prolonged feeding period (see Examples for 
more detail). Lines with significantly changed triglyceride content were
selected as positive candidates for further analysis. The increase or
decrease of triglyceride content due to the loss of a gene function suggests
gene activities in energy homeostasis that, in a dose-dependent manner,
control the amount of energy stored as triglycerides.
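
The selection step described above can be sketched in Python as follows. The text does not state the criterion used to call a change "significant", so the fold-change cutoff below (`min_fold_change`), the function name, and the input layout are all hypothetical placeholders and not the method of the Examples.

```python
from statistics import mean


def flag_candidate_lines(measurements, control_values, min_fold_change=1.5):
    """Return lines whose mean triglyceride content differs from the control.

    measurements: dict mapping a line name to a list of replicate values.
    control_values: list of replicate values for the control flies.
    min_fold_change: hypothetical cutoff; a line is flagged if its mean is
        at least this many times higher or lower than the control mean.
    """
    if min_fold_change <= 1.0:
        raise ValueError("min_fold_change must be greater than 1")
    control_mean = mean(control_values)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    flagged = {}
    for line_name, values in measurements.items():
        ratio = mean(values) / control_mean
        # Both increases and decreases relative to the control are of interest.
        if ratio >= min_fold_change or ratio <= 1.0 / min_fold_change:
            flagged[line_name] = ratio
    return flagged
```

A statistical test on the replicates could replace the simple ratio; the ratio is used here only to keep the sketch short.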

In this invention, the content of triglycerides of a pool of flies with the 
same genotype was analyzed after prolonged feeding using a triglyceride 
assay. Male flies homozygous or heterozygous for the integration of 
vectors for Drosophila EP-lines were analyzed in assays measuring the 
triglyceride contents of these flies, illustrated in more detail in the 
Examples section. The results of the triglyceride content analysis are 
shown in Figures 1, 5, 9, 13, 17, 21, and 25, respectively. 

Genomic DNA sequences were isolated that are localized adjacent to the 
EP or PX vector integration. Using those isolated genomic sequences, public
databases like Berkeley Drosophila Genome Project (GadFly; see also
FlyBase (1999) Nucleic Acids Research 27:85-88) were screened, thereby
identifying the integration sites of the vectors and the corresponding
genes, as described in more detail in the Examples section. The molecular
organization of the genes is shown in Figures 2, 6, 10, 14, 18, 22, and 
26, respectively. 

An additional screen using Drosophila mutants with modifications of the 
eye phenotype identified an interaction of cpo with adipose, a protein 
regulating, causing or contributing to obesity. An additional screen using 
Drosophila mutants with modifications of the eye phenotype identified a 
modification of UCP activity by cpo, thereby leading to an altered 
mitochondrial activity. These findings suggest the presence of similar
activities of these described homologous proteins in humans, which provides
insight into diagnosis, treatment, and prognosis of metabolic disorders.

The Drosophila genes and proteins encoded thereby with functions in the
regulation of triglyceride metabolism were further analysed in publicly 
available sequence databases (see Examples for more detail) and 
mammalian homologs were identified. 

The function of the mammalian homologs in energy homeostasis was 
further validated in this invention by analyzing the expression of the 
transcripts in different tissues and by analyzing the role in adipocyte 
differentiation. Expression profiling studies (see Examples for more detail)
confirm the particular relevance of the protein(s) of the invention as
regulators of energy metabolism in mammals. Further, we show that the
proteins of the invention are regulated by fasting and by genetically
induced obesity. In this invention, we used mouse models of insulin
resistance and/or diabetes, such as mice carrying gene knockouts in the
leptin pathway (for example, ob (leptin) or db (leptin receptor) mice) to
study the expression of the protein of the invention. Such mice develop
typical symptoms of diabetes, show hepatic lipid accumulation and
frequently have increased plasma lipid levels (see Bruning et al., 1998, Mol.
Cell. 2:449-569).

Microarrays are analytical tools routinely used in bioanalysis. A microarray 
has molecules distributed over, and stably associated with, the surface of 
a solid support. The term "microarray" refers to an arrangement of a
plurality of polynucleotides, polypeptides, antibodies, or other chemical
compounds on a substrate. Microarrays of polypeptides, polynucleotides,
and/or antibodies have been developed and find use in a variety of
applications, such as monitoring gene expression, drug discovery, gene
sequencing, gene mapping, bacterial identification, and combinatorial
chemistry. One area in particular in which microarrays find use is in gene
expression analysis (see Example 6). Array technology can be used to
explore the expression of a single polymorphic gene or the expression
profile of a large number of related or unrelated genes. When the
expression of a single gene is examined, arrays are employed to detect the
expression of a specific gene or its variants. When an expression profile is 
examined, arrays provide a platform for identifying genes that are tissue 
specific, are affected by a substance being tested in a toxicology assay, 
are part of a signaling cascade, carry out housekeeping functions, or are 
specifically related to a particular genetic predisposition, condition, disease, 
or disorder. 

Microarrays may be prepared, used, and analyzed using methods known in 
the art (see, for example, Brennan, T.M. et al. (1995) U.S. Patent No.
5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA
93:10614-10619; Baldeschweiler et al. (1995) PCT application
WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505;
Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; Heller,
M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of
microarrays are well known and thoroughly described in Schena, M., ed. 
(1999; DNA Microarrays: A Practical Approach, Oxford University Press, 
London). 

In further embodiments, oligonucleotides or longer fragments derived from 
any of the polynucleotides described herein may be used as elements on a 
microarray. The microarray can be used in transcript imaging techniques, 
which monitor the relative expression levels of large numbers of genes 
simultaneously as described below. The microarray may also be used to 
identify genetic variants, mutations, and polymorphisms. This information 
may be used to determine gene function, to understand the genetic basis 
of a disorder, to diagnose a disorder, to monitor progression/regression of 
disease as a function of gene expression, and to develop and monitor the 
activities of therapeutic agents in the treatment of disease. In particular, 
this information may be used to develop a pharmacogenomic profile of a 
patient in order to select the most appropriate and effective treatment 
regimen for that patient. For example, therapeutic agents that are highly
effective and display the fewest side effects may be selected for a patient
based on his/her pharmacogenomic profile. 

As determined by microarray analysis, Quaking 6 (QKI6), RNA binding
protein HQK-7B, RNA binding protein with multiple splicing (RBPMS), 
Peroxiredoxin 1 (PRDX1), and hypothetical protein LOC55565 show 
differential expression in human primary adipocytes. Thus, Quaking 6 
(QKI6), RNA binding protein HQK-7B, RNA binding protein with multiple 
splicing (RBPMS), Peroxiredoxin 1 (PRDX1), and hypothetical protein 
LOC55565 are strong candidates for the manufacture of a pharmaceutical 
composition and a medicament for the treatment of conditions related to 
human metabolism, such as obesity, diabetes, and/or metabolic syndrome. 

The invention also encompasses polynucleotides that encode a protein of 
the invention or a homologous protein. Accordingly, any nucleic acid 
sequence, which encodes the amino acid sequences of a protein of the 
invention or a homologous protein, can be used to generate recombinant 
molecules that express a protein of the invention or a homologous protein. 
In a particular embodiment, the invention encompasses nucleic acids 
encoding Drosophila CG7956, aralar1, how, CG9373, cpo, Jafrac1, or
CG14440 or human CG7956, aralar1, how, CG9373, cpo, Jafrac1, or
CG14440 homologs, referred to herein as the proteins of the invention. It
will be appreciated by those skilled in the art that as a result of the 
degeneracy of the genetic code, a multitude of nucleotide sequences 
encoding the proteins, some bearing minimal homology to the nucleotide 
sequences of any known and naturally occurring gene, may be produced. 
Thus, the invention contemplates each and every possible variation of 
nucleotide sequence that could be made by selecting combinations based 
on possible codon choices. 
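
To give a sense of how large this multitude is, the following Python sketch counts the distinct nucleotide sequences that encode a given peptide under the standard genetic code, by multiplying the number of synonymous codons available for each residue. The function name is hypothetical; stop codons and non-standard genetic codes are not considered.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Count the nucleotide sequences that encode the peptide (no stop codon)."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unsupported residue symbols: {sorted(unknown)}")
    # Codon choices at each position are independent, so the counts multiply.
    return prod(CODONS_PER_RESIDUE[r] for r in residues)
```

A tripeptide of leucine, arginine, and serine, each with six codons, can already be encoded by 6 x 6 x 6 = 216 different sequences, and the count grows multiplicatively with peptide length.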

Also encompassed by the invention are polynucleotide sequences that are 
capable of hybridizing to the claimed nucleotide sequences, and in 
particular, those of the polynucleotides encoding CG7956, aralar1, how,
CG9373, cpo, Jafrac1, or CG14440, or a homologous protein, preferably
a human homologous protein as described in Table 1, under various
conditions of stringency. Hybridization conditions are based on the melting
temperature (Tm) of the nucleic acid binding complex or probe, as taught
in Wahl, G. M. and S. L. Berger (1987; Methods Enzymol. 152:399-407)
and Kimmel, A. R. (1987; Methods Enzymol. 152:507-511), and may be
used at a defined stringency. Preferably, hybridization under stringent
conditions means that after washing for 1 h with 1 x SSC and 0.1% SDS
at 50°C, preferably at 55°C, more preferably at 62°C and most preferably
at 68°C, particularly for 1 h in 0.2 x SSC and 0.1% SDS at 50°C,
preferably at 55°C, more preferably at 62°C and most preferably at 68°C,
a positive hybridization signal is observed. Altered nucleic acid sequences 
encoding the proteins which are encompassed by the invention include 
deletions, insertions, or substitutions of different nucleotides resulting in a 
polynucleotide that encodes the same or a functionally equivalent protein. 
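
To illustrate why the lower-salt wash (0.2 x SSC) named above is the more stringent condition, the sketch below uses a commonly cited empirical approximation for the melting temperature of long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 675/N. This formula is not taken from the present document. The sodium concentration assumed for 1 x SSC (0.195 M), the function name and the example probe are likewise assumptions for illustration only.

```python
from math import log10

# Assumed Na+ molarity of 1 x SSC (0.15 M NaCl plus 0.015 M trisodium citrate).
SSC_1X_SODIUM_M = 0.195


def estimate_tm(gc_percent: float, length_nt: int, ssc_fold: float) -> float:
    """Estimate duplex Tm in degrees Celsius at a given SSC strength.

    Uses an empirical approximation intended for long DNA duplexes;
    it is a rough guide, not a substitute for experimental optimisation.
    """
    sodium_molar = SSC_1X_SODIUM_M * ssc_fold
    return (81.5
            + 16.6 * log10(sodium_molar)
            + 0.41 * gc_percent
            - 675.0 / length_nt)


if __name__ == "__main__":
    # Hypothetical 500-nt probe with 50% GC content.
    for fold in (1.0, 0.2):
        print(f"{fold} x SSC: estimated Tm = {estimate_tm(50.0, 500, fold):.1f} C")
```

Under this approximation, lowering the salt from 1 x SSC to 0.2 x SSC lowers the estimated Tm by a fixed amount, so a wash at the same temperature sits closer to the Tm and removes more imperfectly matched hybrids.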

The encoded proteins may also contain deletions, insertions, or 
substitutions of amino acid residues, which produce a silent change and 
result in functionally equivalent proteins. Deliberate amino acid 
substitutions may be made on the basis of similarity in polarity, charge, 
solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues as long as the biological activity of the protein is retained. 
Furthermore, the invention relates to peptide fragments of the proteins or 
derivatives of such fragments such as cyclic peptides, retro-inverso 
peptides or peptide mimetics, wherein the peptides or derivatives usually 
have a length of at least four, preferably at least six and up to 50 amino 
acids. 

Also included within the scope of the present invention are alleles of the 
genes encoding a protein of the invention or a homologous protein. As 
used herein, an "allele" or "allelic sequence" is an alternative form of the 
gene, which may result from at least one mutation in the nucleic acid 
sequence. Alleles may result in altered mRNAs or polypeptides whose 
structures or function may or may not be altered. Any given gene may 
have none, one, or many allelic forms. Common mutational changes, which 
give rise to alleles, are generally ascribed to natural deletions, additions, or 
substitutions of nucleotides. Each of these types of changes may occur 
alone, or in combination with the others, one or more times in a given 
sequence. 

The nucleic acid sequences encoding a protein of the invention or a 
homologous protein may be extended utilizing a partial nucleotide sequence 
and employing various methods known in the art to detect upstream 
sequences such as promoters and regulatory elements. For example, one 
method which may be employed, "restriction-site" PCR, uses universal 
primers to retrieve unknown sequence adjacent to a known locus (Sarkar, 
G. (1993) PCR Methods Applic. 2:318-322). Inverse PCR may also be used 
to amplify or extend sequences using divergent primers based on a known 
region (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). Another 
method which may be used is capture PCR which involves PCR 
amplification of DNA fragments adjacent to a known sequence in human 
and yeast artificial chromosome DNA (Lagerstrom, M. et al., PCR Methods 
Applic. 1:111-119). Another method which may be used to retrieve 
unknown sequences is that of Parker, J. D. et al. (1991; Nucleic Acids 
Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and 
PROMOTERFINDER libraries to walk in genomic DNA (Clontech, Palo Alto, 
Calif.). This process avoids the need to screen libraries and is useful in 
finding intron/exon junctions. 

In order to express a biologically active protein, the nucleotide sequences 
encoding the proteins may be inserted into appropriate expression vectors, 
i.e., a vector, which contains the necessary elements for the transcription 
and translation of the inserted coding sequence. Methods, which are well 
known to those skilled in the art, may be used to construct expression 
vectors containing sequences encoding the proteins and appropriate 
transcriptional and translational control elements. These methods include in 
vitro recombinant DNA techniques, synthetic techniques, and in vivo 
genetic recombination. Such techniques are described in Sambrook, J. et 
al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor 
Press, Plainview, N.Y., and Ausubel, F. M. et al. (1989) Current Protocols 
in Molecular Biology, John Wiley & Sons, New York, N.Y. 

In a further embodiment of the invention, nucleic acid sequences encoding 
the sequences of the invention may be ligated to a heterologous sequence 
to encode a fusion protein. Heterologous sequences are preferably located 
at the N- and/or C-terminus of the fusion protein. 

A variety of expression vector/host systems may be utilized to contain and 
express sequences encoding the proteins. These include, but are not 
limited to, micro-organisms such as bacteria transformed with recombinant 
bacteriophage, plasmid, or cosmid DNA expression vectors; yeast 
transformed with yeast expression vectors; insect cell systems infected 
with virus expression vectors (e.g., baculovirus); plant cell systems 
transformed with virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors 
(e.g., Ti or pBR322 plasmids); or animal, e.g. mammalian, cell systems. 

The presence of polynucleotide sequences encoding a protein of the 
invention or a homologous protein can be detected by DNA-DNA or 
DNA-RNA hybridization or amplification using probes or portions or 
fragments of polynucleotides encoding a protein of the invention or a 
homologous protein. Nucleic acid amplification based assays involve the 
use of oligonucleotides or oligomers based on the sequences specific for 
the gene to detect transformants containing DNA or RNA encoding the 
corresponding protein. As used herein "oligonucleotides" or "oligomers" 
refer to a nucleic acid sequence of at least about 10 nucleotides and as 
many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and 
more preferably about 20-25 nucleotides, which can be used as a probe, 
primer or amplimer. 

A variety of protocols for detecting and measuring the expression of 
proteins, using either polyclonal or monoclonal antibodies specific for the 
protein are known in the art. Examples include enzyme-linked 
immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence 
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay 
utilizing monoclonal antibodies reactive to two non-interfering epitopes on 
the protein is preferred, but a competitive binding assay may be employed. 
These and other assays are described, among other places, in Hampton, R. 
et al. (1990; Serological Methods, a Laboratory Manual, APS Press, St 
Paul, Minn.) and Maddox, D. E. et al. (1983; J. Exp. Med. 
158:1211-1216). 

A wide variety of labels and conjugation techniques are known by those 
skilled in the art and may be used in various nucleic acid and protein 
assays, e.g. immunological assays. Means for producing labeled hybridization or PCR 
probes for detecting sequences related to polynucleotides encoding a 
protein of the invention or a homologous protein include oligo-labeling, nick 
translation, end-labeling of RNA probes or PCR amplification using a labeled 
nucleotide. These procedures may be conducted using a variety of 
commercially available kits (Pharmacia & Upjohn, Kalamazoo, Mich.; 
Promega, Madison, Wis.; and U.S. Biochemical Corp., Cleveland, Ohio). 

Suitable reporter molecules or labels, which may be used for nucleic acid 
and protein assays, include radionuclides, enzymes, fluorescent, 
chemiluminescent, or chromogenic agents as well as substrates, 
co-factors, inhibitors, magnetic particles, and the like. 

Host cells transformed with nucleotide sequences encoding the protein 
may be cultured under conditions suitable for the expression and recovery 
of the protein from cell culture. The protein produced by a recombinant cell 
may be secreted or contained intracellularly depending on the sequence 
and/or the vector used. As will be understood by those of skill in the art, 
expression vectors containing polynucleotides which encode the protein 
may be designed to contain signal sequences, which direct secretion of the 
protein through a prokaryotic or eukaryotic cell membrane. Other 
recombinant constructions may be used to join sequences encoding the 
protein to a nucleotide sequence encoding a polypeptide domain, which will 
facilitate purification of soluble proteins. Such purification facilitating 
domains include, but are not limited to, metal chelating peptides such as 
histidine-tryptophan modules that allow purification on immobilized metals, 
protein A domains that allow purification on immobilized immunoglobulin, 
and the domain utilized in the FLAG extension/affinity purification system 
(Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker 
sequences such as those specific for Factor Xa or enterokinase 
(Invitrogen, San Diego, Calif.) between the purification domain and the 
desired protein may be used to facilitate purification. 

Diagnostics and Therapeutics 

The data disclosed in this invention show that the nucleic acids and 
proteins of the invention and effectors/modulators thereof are useful in 
diagnostic and therapeutic applications relating to, for example but not 
limited to, metabolic diseases or dysfunctions, including metabolic 
syndrome, obesity, or diabetes, as well as related disorders such as eating 
disorder, cachexia, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, or gallstones. Hence, 
diagnostic and therapeutic uses for the nucleic acids and proteins of the 
invention are, for example but not limited to, the following: (i) protein 
therapy, (ii) small molecule drug target, (iii) antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic and/or 
prognostic marker, (v) gene therapy (gene delivery/gene ablation), (vi) 
research tools, and (vii) tissue regeneration in vitro and in vivo 
(regeneration for all these tissues and cell types composing these tissues 
and cell types derived from these tissues). 

The nucleic acids and proteins of the invention are useful in various 
diagnostic and therapeutic applications as described 
below. For example, but not limited to, cDNAs encoding the proteins of the 
invention and particularly their human homologues may be useful in gene 
therapy, and the proteins of the invention and particularly their human 
homologues may be useful when administered to a subject in need thereof. 
By way of non-limiting example, the compositions of the present invention 
will have efficacy for treatment of patients suffering from, for example, but 
not limited to, metabolic disorders as described above. 

The nucleic acid sequence encoding a protein of the invention, or a 
homologous protein, or a functional fragment thereof, may further be 
useful in diagnostic applications, wherein the presence or amount of the 
nucleic acids or the proteins are to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the 
novel substances of the invention for use in therapeutic or diagnostic 
methods. 

For example, in one aspect, antibodies which are specific for a protein of 
the invention or a homologous protein may be used directly as an 
antagonist, or indirectly as a targeting or delivery mechanism for bringing 
a pharmaceutical agent to cells or tissue which express the protein. The 
antibodies may be generated using methods that are well known in the art. 
Such antibodies may include, but are not limited to, polyclonal, 
monoclonal, chimeric, single chain, Fab fragments, and fragments 
produced by a Fab expression library. Neutralising antibodies (i.e., those 
which inhibit dimer formation) are especially preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, 
rats, mice, humans, and others, may be immunized by injection with the 
protein or any fragment or oligopeptide thereof which has immunogenic 
properties. Depending on the host species, various adjuvants may be used 
to increase immunological response. It is preferred that the peptides, 
fragments, or oligopeptides used to induce antibodies to the protein have 
an amino acid sequence consisting of at least five amino acids, and more 
preferably at least 10 amino acids. 

Monoclonal antibodies to the proteins may be prepared using any 
technique that provides for the production of antibody molecules by 
continuous cell lines in culture. These include, but are not limited to, the 
hybridoma technique, the human B-cell hybridoma technique, and the 
EBV-hybridoma technique (Kohler, G. et al. (1975) Nature 256:495-497; 
Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. 
Proc. Natl. Acad. Sci. 80:2026-2030; Cole, S. P. et al. (1984) Mol. Cell 
Biol. 62:109-120). 

In addition, techniques developed for the production of "chimeric 
antibodies", the splicing of mouse antibody genes to human antibody 
genes to obtain a molecule with appropriate antigen specificity and 
biological activity can be used (Morrison, S. L. et al. (1984) Proc. Natl. 
Acad. Sci. 81:6851-6855; Neuberger, M. S. et al (1984) Nature 
312:604-608; Takeda, S. et al. (1985) Nature 314:452-454). 
Alternatively, techniques described for the production of single chain 
antibodies may be adapted, using methods known in the art, to produce 
single chain antibodies specific for a protein of the invention or a 
homologous protein. Antibodies with related specificity, but of distinct 
idiotypic composition, may be generated by chain shuffling from random 
combinatorial immunoglobulin libraries (Burton, D. R. (1991) Proc. Natl. 
Acad. Sci. 88:11120-3). Antibodies may also be produced by inducing in 
vivo production in the lymphocyte population or by screening recombinant 
immunoglobulin libraries or panels of highly specific binding reagents as 
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. 
86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299). 

Antibody fragments which contain specific binding sites for the proteins 
may also be generated. For example, such fragments include, but are not 
limited to, the F(ab')2 fragments which can be produced by pepsin 
digestion of the antibody molecule and the Fab fragments which can be 
generated by reducing the disulfide bridges of F(ab')2 fragments. 
Alternatively, Fab expression libraries may be constructed to allow rapid 
and easy identification of monoclonal Fab fragments with the desired 
specificity (Huse, W. D. et al. (1989) Science 254:1275-1281). 

Various immunoassays may be used for screening to identify antibodies 
having the desired specificity. Numerous protocols for competitive binding 
and immunoradiometric assays using either polyclonal or monoclonal 
antibodies with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of complex formation 
between the protein and its specific antibody. A two-site, 
monoclonal-based immunoassay utilising monoclonal antibodies reactive to 
two non-interfering protein epitopes is preferred, but a competitive 
binding assay may also be employed (Maddox, supra). 

In another embodiment of the invention, the polynucleotides or fragments 
thereof, or nucleic acid effector molecules such as antisense molecules, 
aptamers, RNAi molecules or ribozymes may be used for therapeutic 
purposes. In one aspect, aptamers, i.e. nucleic acid molecules, which are 
capable of binding to a protein of the invention and modulating its activity 
may be generated by a screening and selection procedure involving the use 
of combinatorial nucleic acid libraries. 

In a further aspect, antisense molecules may be used in situations in which 
it would be desirable to block the transcription of the mRNA. In particular, 
cells may be transformed with sequences complementary to 
polynucleotides encoding a protein of the invention or a homologous 
protein. Thus, antisense molecules may be used to modulate/effect protein 
activity, or to achieve regulation of gene function. Such technology is now 
well known in the art, and sense or antisense oligomers or larger fragments 
can be designed from various locations along the coding or control regions 
of sequences encoding the proteins. Expression vectors derived from 
retroviruses, adenovirus, herpes or vaccinia viruses, or from various 
bacterial plasmids may be used for delivery of nucleotide sequences to the 
targeted organ, tissue or cell population. Methods, which are well known 
to those skilled in the art, can be used to construct recombinant vectors, 
which will express antisense molecules complementary to the 
polynucleotides of the genes encoding a protein of the invention or a 
homologous protein. These techniques are described both in Sambrook et 
al. (supra) and in Ausubel et al. (supra). Genes encoding a protein of the 
invention or a homologous protein can be turned off by transforming a cell 
or tissue with expression vectors which express high levels of 
polynucleotide which encodes a protein of the invention or a homologous 
protein or a functional fragment thereof. Such constructs may be used to 
introduce untranslatable sense or antisense sequences into a cell. Even in 
the absence of integration into the DNA, such vectors may continue to 
transcribe RNA molecules until they are disabled by endogenous nucleases. 
Transient expression may last for a month or more with a non-replicating 
vector and even longer if appropriate replication elements are part of the 
vector system. 

As mentioned above, modifications of gene expression can be obtained by 
designing antisense molecules, e.g. DNA, RNA, or nucleic acid analogues 
such as PNA, to the control regions of the genes encoding a protein of the 
invention or a homologous protein, i.e., the promoters, enhancers, and 
introns. Oligonucleotides derived from the transcription initiation site, e.g., 
between positions -10 and +10 from the start site, are preferred. Similarly, 
inhibition can be achieved using "triple helix" base-pairing methodology. 
Triple helix pairing is useful because it causes inhibition of the ability of the 
double helix to open sufficiently for the binding of polymerases, 
transcription factors, or regulatory molecules. Recent therapeutic advances 
using triplex DNA have been described in the literature (Gee, J. E. et al. 
(1994) In: Huber, B. E. and B. I. Carr, Molecular and Immunologic 
Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). The antisense 
molecules may also be designed to block translation of mRNA by 
preventing the transcript from binding to ribosomes. 

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the 
specific cleavage of RNA. The mechanism of ribozyme action involves 
sequence-specific hybridization of the ribozyme molecule to complementary 
target RNA, followed by endonucleolytic cleavage. Examples, which may 
be used, include engineered hammerhead motif ribozyme molecules that 
can specifically and efficiently catalyze endonucleolytic cleavage of 
sequences encoding a protein of the invention or a homologous protein. 
Specific ribozyme cleavage sites within any potential RNA target are 
initially identified by scanning the target molecule for ribozyme cleavage 
sites which include the following sequences: GUA, GUU, and GUC. Once 
identified, short RNA sequences of between 15 and 20 ribonucleotides 
corresponding to the region of the target gene containing the cleavage site 
may be evaluated for secondary structural features which may render the 
oligonucleotide inoperable. The suitability of candidate targets may also be 
evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
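
The site-identification step described above can be sketched as a simple scan for the GUA, GUU and GUC triplets, returning a short region around each hit for later structural evaluation. This is a minimal sketch: the function name, the default 20-nucleotide window and the way the window is centred on the triplet are assumptions for illustration, and the secondary-structure and accessibility checks mentioned in the text are not implemented.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna: str, window: int = 20):
    """Return (position, triplet, surrounding region) for each candidate site.

    `position` is the 0-based index of the G of the triplet. The region is a
    window of up to `window` nucleotides roughly centred on the triplet; it is
    shorter when the hit lies near the 3' end of the sequence.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            start = max(0, i + 1 - window // 2)
            end = min(len(rna), start + window)
            sites.append((i, triplet, rna[start:end]))
    return sites


if __name__ == "__main__":
    # Hypothetical target fragment, used only to show the output format.
    for position, triplet, region in find_cleavage_sites("AAAGUCAAACCGUAGG"):
        print(position, triplet, region)
```

Each returned region would then be assessed separately, for example with an RNA folding tool and with ribonuclease protection assays as described above.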

Nucleic acid effector molecules, e.g. antisense molecules and ribozymes of 
the invention may be prepared by any method known in the art for the 
synthesis of nucleic acid molecules. These include techniques for 
chemically synthesizing oligonucleotides such as solid phase 
phosphoramidite chemical synthesis. Alternatively, RNA molecules may be 
generated by in vitro and in vivo transcription of DNA sequences encoding 
a protein of the invention or a homologous protein. Such DNA sequences 
may be incorporated into a variety of vectors with suitable RNA 
polymerase promoters such as T7 or SP6. Alternatively, these cDNA 
constructs that synthesize antisense RNA constitutively or inducibly can be 
introduced into cell lines, cells, or tissues. RNA molecules may be modified 
to increase intracellular stability and half-life. Possible modifications 
include, but are not limited to, the addition of flanking sequences at the 5' 
and/or 3' ends of the molecule or the use of phosphorothioate or 2' 
O-methyl rather than phosphodiester linkages within the backbone of 
the molecule. This concept is inherent in the production of PNAs and can 
be extended in all of these molecules by the inclusion of non-traditional 
bases such as inosine, queuosine, and wybutosine, as well as acetyl-, 
methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, 
thymine, and uridine which are not as easily recognized by endogenous 
endonucleases. 

Many methods for introducing vectors into cells or tissues are available and 
equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, 
vectors may be introduced into stem cells taken from the patient and 
clonally propagated for autologous transplant back into that same patient. 
Delivery by transfection and by liposome injections may be achieved using 
methods, which are well known in the art. Any of the therapeutic methods 
described above may be applied to any suitable subject including, for 
example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, 
and most preferably, humans. 

An additional embodiment of the invention relates to the administration of 
a pharmaceutical composition, in conjunction with a pharmaceutically 
acceptable carrier, for any of the therapeutic effects discussed above. 
Such pharmaceutical compositions may consist of a protein of the 
invention or a homologous nucleic acid sequence or protein, antibodies to 
a protein of the invention or a homologous protein, mimetics, agonists, 
antagonists, or inhibitors of a protein of the invention or a homologous 
protein or nucleic acid sequence. The compositions may be administered 
alone or in combination with at least one other agent, such as a stabilizing 
compound, which may be administered in any sterile, biocompatible 
pharmaceutical carrier, including, but not limited to, saline, buffered saline, 
dextrose, and water. The compositions may be administered to a patient 
alone, or in combination with other agents, drugs or hormones. The 
pharmaceutical compositions utilized in this invention may be administered 
by any number of routes including, but not limited to, oral, intravenous, 
intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, 
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, 
sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions 
may contain suitable pharmaceutically-acceptable carriers comprising 
excipients and auxiliaries, which facilitate processing of the active 
compounds into preparations which can be used pharmaceutically. Further 
details on techniques for formulation and administration may be found in 
the latest edition of Remington's Pharmaceutical Sciences (Mack 
Publishing Co., Easton, Pa.). 

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is known in the art, e.g., by means of 
conventional mixing, dissolving, granulating, dragee-making, levigating, 
emulsifying, encapsulating, entrapping, or lyophilizing processes. The 
pharmaceutical composition may be provided as a salt and can be formed 
with many acids. After pharmaceutical compositions have been prepared, 
they can be placed in an appropriate container and labeled for treatment of 
an indicated condition. For administration of proteins, such labeling would 
include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include 
compositions wherein the active ingredients are contained in an effective 
amount to achieve the intended purpose. The determination of an effective 
dose is well within the capability of those skilled in the art. For any 
compounds, the therapeutically effective dose can be estimated initially 
either in cell culture assays, e.g., of preadipocyte cell lines, or in animal 
models, usually mice, rabbits, dogs, or pigs. The animal model may also be 
used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful 
doses and routes for administration in humans. A therapeutically effective 
dose refers to that amount of active ingredient, for example a protein of 
the invention or a homologous protein or nucleic acid sequence or 
functional fragment thereof, or antibodies, which is sufficient for treating 
a specific condition. Therapeutic efficacy and toxicity may be determined 
by standard pharmaceutical procedures in cell cultures or experimental 
animals, e.g., ED50 (the dose therapeutically effective in 50% of the 
population) and LD50 (the dose lethal to 50% of the population). The dose 
ratio between therapeutic and toxic effects is the therapeutic index, and it 
can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions, 
which exhibit large therapeutic indices, are preferred. The data obtained 
from cell culture assays and animal studies is used in formulating a range 
of dosage for human use. The dosage contained in such compositions is 
preferably within a range of circulating concentrations that include the 
ED50 with little or no toxicity. The dosage varies within this range 
depending upon the dosage form employed, sensitivity of the patient, and 
the route of administration. The exact dosage will be determined by the 
practitioner, in light of factors related to the subject that requires 
treatment. Dosage and administration are adjusted to provide sufficient 
levels of the active moiety or to maintain the desired effect. Factors, which 
may be taken into account, include the severity of the disease state, 
general health of the subject, age, weight, and gender of the subject, diet, 
time and frequency of administration, drug combination(s), reaction 
sensitivities, and tolerance/response to therapy. Long-acting 
pharmaceutical compositions may be administered every 3 to 4 days, every 
week, or once every two weeks depending on half-life and clearance rate 
of the particular formulation. Normal dosage amounts may vary from 0.1 to 
100,000 micrograms, up to a total dose of about 1 g, depending upon the 
route of administration. Guidance as to particular dosages and methods of 
delivery is provided in the literature and generally available to practitioners 
in the art. Those skilled in the art employ different formulations for 
nucleotides than for proteins or their inhibitors. Similarly, delivery of 
polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 
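
The therapeutic index defined above as the ratio LD50/ED50 reduces to a one-line calculation. The sketch below is illustrative only: the function name is hypothetical and the dose values are invented for the example, not measured data.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example mg per kg body weight).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Invented example values: LD50 = 500 mg/kg, ED50 = 25 mg/kg.
    print(therapeutic_index(500.0, 25.0))
```

A larger ratio means a wider margin between the effective and the lethal dose, which is why compositions with large therapeutic indices are preferred.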

In another embodiment, antibodies which specifically bind to a protein of 
the invention may be used for the diagnosis of conditions or diseases 
characterized by or associated with over- or underexpression of a protein 
of the invention or a homologous protein, or in assays to monitor patients 
being treated with a protein of the invention or a homologous protein, 
agonists, antagonists or inhibitors. The antibodies useful for diagnostic 
purposes may be prepared in the same manner as those described above 
for therapeutics. Diagnostic assays include methods which utilize the 
antibody and a label to detect the protein in human body fluids or extracts 
of cells or tissues. The antibodies may be used with or without 
modification, and may be labeled by joining them, either covalently or 
non-covalently, with a reporter molecule. A wide variety of reporter 
molecules which are known in the art may be used, several of which are 
described above. 

A variety of protocols including ELISA, RIA, and FACS for measuring 
proteins are known in the art and provide a basis for diagnosing altered or 
abnormal levels of gene expression. Normal or standard values for gene 
expression are established by combining body fluids or cell extracts taken 
from normal mammalian subjects, preferably human, with antibodies to the 
protein under conditions suitable for complex formation. The amount of 
standard complex formation may be quantified by various methods, but 
preferably by photometric means. Quantities of protein expressed in control 
and disease samples, e.g. from biopsied tissues, are compared with the 
standard values. Deviation between standard and subject values 
establishes the parameters for diagnosing disease. 

In another embodiment of the invention, the polynucleotides specific for a 
protein of the invention or a homologous protein may be used for 
diagnostic purposes. The polynucleotides, which may be used, include 
oligonucleotide sequences, antisense RNA and DNA molecules, and PNAs. 
The polynucleotides may be used to detect and quantitate gene expression 
in biopsied tissues in which gene expression may be correlated with 
disease. The diagnostic assay may be used to distinguish between 
absence, presence, and excess gene expression, and to monitor regulation 
of protein levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of 
detecting polynucleotide sequences, including genomic sequences, 
encoding a protein of the invention or a homologous protein or closely 
related molecules, may be used to identify nucleic acid sequences which 
encode the respective protein. The hybridization probes of the subject 
invention may be DNA or RNA and are preferably derived from the 
nucleotide sequence of the polynucleotide encoding a CG7956, aralar1, 
how, CG9373, cpo, Jafrac1, or CG14440 homologous protein, preferably 
a human homologous protein as described in Table 1 or from a genomic 
sequence including promoter, enhancer elements, and introns of the 
naturally occurring gene. Means for producing specific hybridization probes 
for DNAs encoding a protein of the invention or a homologous protein 
include the cloning of nucleic acid sequences specific for a protein of the 
invention or a homologous protein into vectors for the production of mRNA 
probes. Such vectors are known in the art, commercially available, and 
may be used to synthesize RNA probes in vitro by means of the addition of 
the appropriate RNA polymerases and the appropriate labeled nucleotides. 
Hybridization probes may be labeled by a variety of reporter groups, for 
example, radionuclides such as 32P or 35S, or enzymatic labels, such as 
alkaline phosphatase coupled to the probe via avidin/biotin coupling 
systems, and the like. 

Polynucleotide sequences specific for a protein of the invention or 
homologous nucleic acids may be used for the diagnosis of conditions or 
diseases, which are associated with the expression of the proteins. 
Examples of such conditions or diseases include, but are not limited to, 
metabolic diseases and disorders, including obesity and diabetes. 
Polynucleotide sequences specific for a protein of the invention or a 
homologous protein may also be used to monitor the progress of patients 
receiving treatment for metabolic diseases and disorders, including obesity 
and diabetes. The polynucleotide sequences may be used in Southern or 
Northern analysis, dot blot, or other membrane-based technologies; in PCR 
technologies; or in dip stick, pin, ELISA or chip assays utilizing fluids or 
tissues from patient biopsies to detect altered gene expression. Such 
qualitative or quantitative methods are well known in the art. 

In a particular aspect, the nucleotide sequences specific for a protein of the
invention or homologous nucleic acids may be useful in assays that detect
activation or induction of various metabolic diseases or dysfunctions,
including metabolic syndrome, obesity, or diabetes. The nucleotide
sequences may be labeled by standard methods, and added to a fluid or
tissue sample from a patient under conditions suitable for the formation of
hybridization complexes. After a suitable incubation period, the sample is
washed and the signal is quantitated and compared with a standard value.
A significant deviation of the signal from the standard value may indicate
the presence of the associated disease. Such assays may also be used to
evaluate the efficacy of a particular therapeutic treatment regimen in
animal studies, in clinical trials, or in monitoring the treatment of an
individual patient.

In order to provide a basis for the diagnosis of a disease associated with 
expression of a protein of the invention or a homologous protein, a normal 
or standard profile for expression is established. This may be accomplished 
by combining body fluids or cell extracts taken from normal subjects, either 
animal or human, with a sequence, or a fragment thereof, which is specific 
for nucleic acids encoding a protein of the invention or homologous nucleic 
acids, under conditions suitable for hybridization or amplification. Standard 
hybridization may be quantified by comparing the values obtained from 
normal subjects with those from an experiment where a known amount of 
a substantially purified polynucleotide is used. Standard values obtained 
from normal samples may be compared with values obtained from samples 
from patients who are symptomatic for disease. Deviation between 
standard and subject values is used to establish the presence of disease. 
Once disease is established and a treatment protocol is initiated, 
hybridization assays may be repeated on a regular basis to evaluate 
whether the level of expression in the patient begins to approximate that
observed in a normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a 
period ranging from several days to months. 
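The comparison of subject values with a standard profile described above can be illustrated by a minimal Python sketch. It is an illustration only: the signal values are hypothetical, the units are arbitrary, and expressing the deviation in units of the reference standard deviation is an assumption made here, since no particular measure of deviation or decision threshold is specified.

```python
from statistics import mean, stdev


def deviation_from_reference(normal_values, subject_value):
    """Return the subject's deviation from the normal mean, in reference SDs."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation of the normal group
    if sigma == 0:
        raise ValueError("reference values show no variation")
    return (subject_value - mu) / sigma


# Hypothetical hybridization signals (arbitrary units) from normal subjects
normal = [98.0, 102.0, 100.0, 97.0, 103.0]

# One subject close to the reference mean, one with a clearly elevated signal
z_scores = [deviation_from_reference(normal, v) for v in (100.0, 130.0)]
```

Repeating the same calculation on successive samples from a treated patient would show whether the value moves back toward the reference range over time.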

With respect to metabolic diseases or dysfunctions, including metabolic 
syndrome, obesity, or diabetes, the presence of a relatively high amount of 
transcript in biopsied tissue from an individual may indicate a predisposition 
for the development of the disease, or may provide a means for detecting 
the disease prior to the appearance of actual clinical symptoms. A more
definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier, thereby preventing
the development or further progression of the metabolic diseases and
disorders.

Additional diagnostic uses for oligonucleotides designed from the
sequences encoding a protein of the invention or a homologous protein
may involve the use of PCR. Such oligomers may be chemically
synthesized, generated enzymatically, or produced from a recombinant
source. Oligomers will preferably consist of two nucleotide sequences, one
with sense orientation (5'→3') and another with antisense orientation
(3'←5'), employed under optimized conditions for
identification of a specific gene or condition. The same two oligomers, 
nested sets of oligomers, or even a degenerate pool of oligomers may be 
employed under less stringent conditions for detection and/or quantification 
of closely related DNA or RNA sequences. 
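The relationship between a sense and an antisense oligomer can be sketched in Python as follows. The template sequence, the oligomer length and the function names are hypothetical and chosen only for illustration; a real primer design would additionally take melting temperature, specificity and secondary structure into account.

```python
# Translation table mapping each base to its complement
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20):
    """Return (sense, antisense) oligomers flanking the template, both 5'->3'."""
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping oligomers")
    sense = template[:length]                           # same strand as the template
    antisense = reverse_complement(template[-length:])  # anneals to the template's 3' end
    return sense, antisense


# Hypothetical 15-nucleotide template, for illustration only
sense, antisense = primer_pair("ATGGCCTTAGGCTGA", length=5)
```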

Methods which may also be used to quantitate the expression of a protein 
of the invention or a homologous protein include radiolabeling or 
biotinylating nucleotides, coamplification of a control nucleic acid, and 
standard curves onto which the experimental results are interpolated 
(Melby, P. C. et al. (1993) J. Immunol. Methods, 159:235-244; Duplaa, C.
et al. (1993) Anal. Biochem. 212:229-236). The speed of quantification of
multiple samples may be accelerated by running the assay in an ELISA 
format where the oligomer of interest is presented in various dilutions and 
a spectrophotometric or colorimetric response gives rapid quantification. 
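Interpolation of an experimental result onto a standard curve may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the standards are hypothetical (amount, signal) pairs, the signal is assumed to increase with the logarithm of the amount, and piecewise linear interpolation is used in place of any particular curve-fitting model from the cited references.

```python
import math


def interpolate_amount(standards, signal):
    """Estimate an amount from a signal by interpolating log10(amount) vs signal.

    standards: iterable of (known_amount, measured_signal), amounts > 0.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by measured signal
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s1 <= s0:
            raise ValueError("standard signals must be strictly increasing")
        if s0 <= signal <= s1:
            fraction = (signal - s0) / (s1 - s0)
            log_amount = math.log10(a0) + fraction * (math.log10(a1) - math.log10(a0))
            return 10 ** log_amount
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical dilution series: (amount, optical density)
standards = [(1.0, 0.10), (10.0, 0.40), (100.0, 0.70), (1000.0, 1.00)]
estimate = interpolate_amount(standards, 0.55)
```

A signal outside the range covered by the standards is rejected rather than extrapolated, since the curve is only defined between the lowest and highest standard.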

In another embodiment of the invention, the nucleic acid sequences which 
are specific for a protein of the invention or homologous nucleic acids may 
also be used to generate hybridization probes, which are useful for 
mapping the naturally occurring genomic sequence. The sequences may be 
mapped to a particular chromosome or to a specific region of the 
chromosome using well known techniques. Such techniques include FISH, 
FACS, or artificial chromosome constructions, such as yeast artificial
chromosomes, bacterial artificial chromosomes, bacterial P1 constructions
or single chromosome cDNA libraries as reviewed in Price, C. M. (1993)
Blood Rev. 7:127-134, and Trask, B. J. (1991) Trends Genet. 7:149-154.
FISH (as described in Verma et al. (1988) Human Chromosomes: A Manual
of Basic Techniques, Pergamon Press, New York, N.Y.) may be correlated
with other physical chromosome mapping techniques and genetic map
data. Examples of genetic map data can be found in the 1994 Genome
Issue of Science (265:1981f). Correlation between the location of the gene
encoding a protein of the invention or a homologous protein on a physical
chromosomal map and a specific disease, or predisposition to a specific
disease, may help to delimit the region of DNA associated with that genetic
disease.

The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier, or affected
individuals. An analysis of polymorphisms, e.g. single nucleotide
polymorphisms, may be carried out. Further, in situ hybridization of
chromosomal preparations and physical mapping techniques such as
linkage analysis using established chromosomal markers may be used for
extending genetic maps. Often the placement of a gene on the
chromosome of another mammalian species, such as mouse, may reveal
associated markers even if the number or arm of a particular human
chromosome is not known. New sequences can be assigned to
chromosomal arms, or parts thereof, by physical mapping. This provides
valuable information to investigators searching for disease genes using
positional cloning or other gene discovery techniques. Once the disease or
syndrome has been crudely localized by genetic linkage to a particular
genomic region, for example, AT to 11q22-23 (Gatti, R. A. et al. (1988)
Nature 336:577-580), any sequences mapping to that area may represent
associated or regulatory genes for further investigation. The nucleotide
sequences of the subject invention may also be used to detect differences
in the chromosomal location due to translocation, inversion, etc. among
normal, carrier, or affected individuals.

In another embodiment of the invention, a protein of the invention or a 
homologous protein, its catalytic or immunogenic fragments or 
oligopeptides thereof, an in vitro model, a genetically altered cell or animal, 
can be used for screening libraries of compounds, e.g. peptides or 
low-molecular weight organic compounds, in any of a variety of drug 
screening techniques. One can identify modulators/effectors, e.g. 
receptors, enzymes, proteins, ligands, or substrates that bind to, modulate 
or mimic the action of one or more of the proteins of the invention. The 
protein or fragment employed in such screening may be free in solution, 
affixed to a solid support, borne on a cell surface, or located intracellularly.
The formation of binding complexes, between a protein of the invention or 
a homologous protein and the agent tested, may be measured. Agents may 
also, either directly or indirectly, influence the activity of the proteins of 
the invention. 

In addition, the activity of the proteins of the invention against their
physiological substrate(s) or derivatives thereof could be measured in 
cell-based assays. Agents may also interfere with posttranslational 
modifications of the proteins of the invention, such as phosphorylation and 
dephosphorylation, farnesylation, palmitoylation, acetylation, alkylation, 
ubiquitination, proteolytic processing, subcellular localization and 
degradation. Moreover, agents could influence the dimerization or 
oligomerization of the proteins of the invention or, in a heterologous 
manner, of the proteins of the invention with other proteins, for example, 
but not exclusively, docking proteins, enzymes, receptors, ion channels, 
uncoupling proteins, or translation factors. Agents could also act on the 
physical interaction of the proteins of this invention with other proteins, 
which are required for protein function, for example, but not exclusively, 
their downstream signaling.

The phosphatase activity of the Sac domain-containing inositol
phosphatase 2 (SAC2) of the invention could be measured in vitro by using
recombinantly expressed and purified SAC2 or fragments thereof and
artificial phosphatase substrates well known in the art, for example, but
not exclusively, DiFMUP or FDP (Molecular Probes, Eugene, Oregon),
which are converted to fluorophores or chromophores upon
dephosphorylation. Alternatively, the dephosphorylation of physiological
substrates of SAC2 could be measured by making use of any of the well
known screening technologies suitable for the detection of the
phosphorylation status of SAC2 inositol substrates, for example in a
procedure similar to that described for the inositol phosphatase SHIP2
(T. Habib et al. (1998), JBC 273, 18605-18609). In addition, the activity of
SAC2 against its physiological substrate(s) or derivatives thereof could be
measured in cell-based assays, thereby determining the activity of the
phosphatase at the level of its downstream signalling.

Methods for determining protein-protein interaction are well known in the
art. For example, binding of a fluorescently labeled peptide derived from a
protein of the invention to the interacting protein (or vice versa) could be
detected by a change in polarisation. Where both binding partners are
fluorescently labeled (either both as full length proteins, or one as the full
length protein and the other represented only as a peptide), binding could
be detected by fluorescence resonance energy transfer (FRET) from one
fluorophore to the other. In addition, a variety of commercially available
assay principles suitable for detection of protein-protein interaction are
well known in the art, for example, but not exclusively, AlphaScreen
(PerkinElmer) or Scintillation Proximity Assays (SPA) by Amersham.
Alternatively, the interaction of the proteins of the invention with cellular
proteins could be the basis for a cell-based screening assay, in which both
proteins are fluorescently labeled and interaction of both proteins is
detected by analysing cotranslocation of both proteins with a cellular
imaging reader, as has been developed, for example, but not exclusively,
by Cellomics or EvotecOAI. In all cases the two or more binding partners
can be different proteins with one being the protein of the invention, or in
case of dimerization and/or oligomerization the protein of the invention
itself. Proteins of the invention for which such protein/protein
interactions would be one target mechanism of interest, but not the only
one, are CG7956, aralar1, how, CG9373, cpo, Jafrad, or CG14440
homologous proteins.
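The two fluorescence read-outs mentioned above can be expressed with their standard textbook definitions, sketched here in Python. The intensity values are hypothetical, and the FRET efficiency is calculated from donor quenching, which is only one of several common ways of estimating it.

```python
def polarization(i_parallel: float, i_perpendicular: float) -> float:
    """Fluorescence polarization P = (I_par - I_perp) / (I_par + I_perp)."""
    total = i_parallel + i_perpendicular
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return (i_parallel - i_perpendicular) / total


def fret_efficiency(donor_with_acceptor: float, donor_alone: float) -> float:
    """FRET efficiency E = 1 - F_DA / F_D, estimated from donor quenching."""
    if donor_alone <= 0:
        raise ValueError("donor-only intensity must be positive")
    return 1.0 - donor_with_acceptor / donor_alone


# Hypothetical intensities (arbitrary units)
p_free = polarization(105.0, 100.0)   # small labeled peptide tumbling freely
p_bound = polarization(150.0, 100.0)  # same peptide bound to a larger partner
efficiency = fret_efficiency(40.0, 100.0)
```

An increase in polarization on addition of the binding partner, or a non-zero FRET efficiency, would be taken as an indication of binding; a test agent that disrupts the interaction would be expected to reverse the change.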

Assays for determining enzymatic and carrier activity of the proteins of the 
invention are well known in the art. A variety of assay formats to measure
receptor-ligand binding are also well known in the art.

Of particular interest are screening assays for agents that have a low 
toxicity for mammalian cells. The term "agent" as used herein describes 
any molecule, e.g. protein or pharmaceutical, with the capability of altering 
or mimicking the physiological function of one or more of the proteins of 
the invention. Candidate agents encompass numerous chemical classes, 
though typically they are organic molecules, preferably small organic 
compounds having a molecular weight of more than 50 and less than 
about 2,500 Daltons. Candidate agents comprise functional groups 
necessary for structural interaction with proteins, particularly hydrogen 
bonding, and typically include at least an amine, carbonyl, hydroxyl or 
carboxyl group, preferably at least two of the functional chemical groups. 
The candidate agents often comprise carbocyclic or heterocyclic structures 
and/or aromatic or polyaromatic structures substituted with one or more of 
the above functional groups. 
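The selection criteria for small organic candidate agents given above can be sketched as a simple filter in Python. The compound records, their names and the representation of functional groups as plain strings are hypothetical; an actual screening library would typically be handled with a cheminformatics toolkit rather than hand-annotated records.

```python
from dataclasses import dataclass, field

# Functional groups named in the text as typical for candidate agents
FUNCTIONAL_GROUPS = {"amine", "carbonyl", "hydroxyl", "carboxyl"}


@dataclass
class Compound:
    name: str
    molecular_weight: float  # in Daltons
    groups: set = field(default_factory=set)


def is_candidate(compound: Compound, min_groups: int = 1) -> bool:
    """Apply the molecular weight window and the functional group criterion."""
    in_range = 50 < compound.molecular_weight < 2500
    matching = len(compound.groups & FUNCTIONAL_GROUPS)
    return in_range and matching >= min_groups


# Hypothetical library entries, for illustration only
library = [
    Compound("hypothetical-1", 320.4, {"amine", "hydroxyl"}),
    Compound("hypothetical-2", 18.0, {"hydroxyl"}),  # below the weight window
    Compound("hypothetical-3", 410.0, {"halide"}),   # none of the listed groups
]
candidates = [c.name for c in library if is_candidate(c)]
preferred = [c.name for c in library if is_candidate(c, min_groups=2)]
```

Setting `min_groups=2` corresponds to the preferred case of at least two of the functional chemical groups.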

Candidate agents are also found among biomolecules including peptides, 
saccharides, fatty acids, steroids, purines, pyrimidines, nucleic acids and 
derivatives, structural analogs or combinations thereof. Candidate agents
are obtained from a wide variety of sources including libraries of synthetic 
or natural compounds. For example, numerous means are available for
random and directed synthesis of a wide variety of organic compounds and
biomolecules, including expression of randomized oligonucleotides and 
oligopeptides. Alternatively, libraries of natural compounds in the form of 
bacterial, fungal, plant and animal extracts are available or readily 
produced. Additionally, natural or synthetically produced libraries and 
compounds are readily modified through conventional chemical, physical 
and biochemical means, and may be used to produce combinatorial 
libraries. Known pharmacological agents may be subjected to directed or 
random chemical modifications, such as acylation, alkylation, esterification, 
amidification, etc. to produce structural analogs. Where the screening 
assay is a binding assay, one or more of the molecules may be joined to a 
label, where the label can directly or indirectly provide a detectable signal. 

Another technique for drug screening, which may be used, provides for 
high throughput screening of compounds having suitable binding affinity to 
the protein of interest as described in published PCT application 
WO84/03564. In this method, as applied to a protein of the invention or a 
homologous protein, large numbers of different small test compounds are 
synthesized on a solid substrate, such as plastic pins or some other 
surface. The test compounds are reacted with the proteins, or fragments 
thereof, and washed. Bound proteins are then detected by methods well 
known in the art. Purified proteins can also be coated directly onto plates 
for use in the aforementioned drug screening techniques. Alternatively, 
non-neutralizing antibodies can be used to capture the peptide and 
immobilize it on a solid support. In another embodiment, one may use 
competitive drug screening assays in which neutralizing antibodies capable 
of binding a protein of the invention specifically compete with a test 
compound for binding the protein. In this manner, the antibodies can be 
used to detect the presence of any peptide, which shares one or more 
antigenic determinants with the protein of the invention.

The nucleic acids encoding the proteins of the invention can be used to
generate transgenic cell lines and animals. These transgenic non-human 
animals are useful in the study of the function and regulation of the 
proteins of the invention in vivo. Transgenic animals, particularly 
mammalian transgenic animals, can serve as a model system for the 
investigation of many developmental and cellular processes common to 
humans. A variety of non-human models of metabolic disorders can be 
used to test modulators of the protein of the invention. Misexpression (for 
example, overexpression or lack of expression) of the protein of the 
invention, particular feeding conditions, and/or administration of 
biologically active compounds can create models of metabolic disorders.

In one embodiment of the invention, such assays use mouse models of 
insulin resistance and/or diabetes, such as mice carrying gene knockouts in 
the leptin pathway (for example, ob (leptin) or db (leptin receptor) mice). 
Such mice develop typical symptoms of diabetes, show hepatic lipid
accumulation and frequently have increased plasma lipid levels (see
Bruning et al., 1998, Mol. Cell. 2:449-569). Susceptible wild type mice (for
example C57Bl/6) show similar symptoms if fed a high fat diet. In addition
to testing the expression of the proteins of the invention in such mouse 
strains (see EXAMPLES section), these mice could be used to test whether 
administration of a candidate modulator alters for example lipid 
accumulation in the liver, in plasma, or adipose tissues using standard 
assays well known in the art, such as FPLC, colorimetric assays, blood 
glucose level tests, insulin tolerance tests and others. 

Transgenic animals may be made through homologous recombination in 
non-human embryonic stem cells, where the normal locus of the gene 
encoding the protein of the invention is mutated. Alternatively, a nucleic 
acid construct encoding the protein is injected into oocytes and is 
randomly integrated into the genome. One may also express the genes of 
the invention or variants thereof in tissues where they are not normally
expressed or at abnormal times of development. Furthermore, variants of
the genes of the invention, such as specific constructs expressing anti-sense
molecules or dominant negative mutations which will block or alter the
expression of the proteins of the invention, may be randomly
integrated into the genome. A detectable marker, such as lacZ or
luciferase, may be introduced into the locus of the genes of the invention,
where upregulation of expression of the genes of the invention will result 
in an easily detectable change in phenotype. Vectors for stable integration 
include plasmids, retroviruses and other animal viruses, yeast artificial 
chromosomes (YACs), and the like. 

DNA constructs for homologous recombination will contain at least 
portions of the genes of the invention with the desired genetic 
modification, and will include regions of homology to the target locus. 
Conveniently, markers for positive and negative selection are included. 
DNA constructs for random integration do not need to contain regions of 
homology to mediate recombination. DNA constructs for random 
integration will consist of the nucleic acids encoding the proteins of the 
invention, a regulatory element (promoter), an intron and a poly-adenylation 
signal. Methods for generating cells having targeted gene modifications 
through homologous recombination are known in the field. For non-human 
embryonic stem (ES) cells, an ES cell line may be employed, or embryonic 
cells may be obtained freshly from a host, e.g. mouse, rat, guinea pig, etc. 
Such cells are grown on an appropriate fibroblast-feeder layer and are 
grown in the presence of leukemia inhibitory factor (LIF).

When non-human ES or non-human embryonic cells or somatic pluripotent 
stem cells have been transformed, they may be used to produce transgenic 
animals. After transformation, the cells are plated onto a feeder layer in an 
appropriate medium. Cells containing the construct may be selected by 
employing a selective medium. After sufficient time for colonies to grow, 
they are picked and analyzed for the occurrence of homologous
recombination or integration of the construct. Those colonies that are
positive may then be used for embryo transfection and blastocyst injection. 
Blastocysts are obtained from 4 to 6 week old superovulated females. The 
ES cells are trypsinized, and the modified cells are injected into the 
blastocoel of the blastocyst. After injection, the blastocysts are returned to 
each uterine horn of pseudopregnant females. Females are then allowed to 
go to term and the resulting offspring is screened for the construct. By 
providing for a different phenotype of the blastocyst and the genetically 
modified cells, chimeric progeny can be readily detected. The chimeric 
animals are screened for the presence of the modified gene and males and 
females having the modification are mated to produce homozygous 
progeny. If the gene alterations cause lethality at some point in 
development, tissues or organs can be maintained as allogenic or congenic 
grafts or transplants, or in vitro culture. The transgenic animals may be any 
non-human mammal, such as laboratory animals, domestic animals, etc. The
transgenic animals may be used in functional studies, drug screening, etc. 

Finally, the invention also relates to a kit comprising at least one of 

(a) a CG7956, aralar1, how, CG9373, cpo, Jafrad, or CG14440
homologous nucleic acid molecule or a functional fragment thereof;

(b) a CG7956, aralar1, how, CG9373, cpo, Jafrad, or CG14440
homologous amino acid molecule or a functional fragment or an 
isoform thereof; 

(c) a vector comprising the nucleic acid of (a); 

(d) a host cell comprising the nucleic acid of (a) or the vector of (c); 

(e) a polypeptide encoded by the nucleic acid of (a); 

(f) a fusion polypeptide encoded by the nucleic acid of (a); 

(g) an antibody, an aptamer or another effector/modulator against the 
nucleic acid of (a) or the polypeptide of (b), (e), or (f); and

(h) an anti-sense oligonucleotide of the nucleic acid of (a). 




The kit may be used for diagnostic or therapeutic purposes or for screening 
applications as described above. The kit may further contain user 
instructions. 

The Figures show:

Figure 1 shows the triglyceride content of a Drosophila Gadfly Accession 
Number CG7956 mutant. Shown is the change of triglyceride content of 
HD-EP(3)31805 flies caused by integration of the P-vector 3 base pairs 5' 
of the CG7956 transcription unit (referred to as 'HD-EP31805', column 2)
in comparison to controls containing all flies of the EP collection (referred 
to as 'EP-control', column 1). 

Figure 2 shows the molecular organization of the mutated CG7956 (Gadfly 
Accession Number) gene locus.

Figure 3 shows the BLASTP search result for the Gadfly Accession Number 
CG7956 gene product (Query) with the best human homologous match 
(Sbjct). 


Figure 4 shows the expression of the CG7956 homolog in mammalian 
tissues. 

Figure 4A shows the real-time PCR analysis of Sac domain-containing 
inositol phosphatase 2 (SAC2) expression in wild-type mouse tissues. 
Figure 4B shows the real-time PCR analysis of SAC2 expression in different
mouse models. 

Figure 5 shows the triglyceride content of a Drosophila aralar1 (Gadfly
Accession Number CG2139) mutant. Shown is the change of triglyceride
content of EP(3)3675 flies caused by integration of the P-vector into an
intron of the CG2139 gene (referred to as 'EP(3)3675', column 2) in
comparison to controls containing all flies of the EP collection (referred to
as 'EP-control', column 1).

Figure 6 shows the molecular organization of the mutated aralar1 (Gadfly
Accession Number CG2139) gene locus. 

Figure 7 shows the homology of Drosophila aralar1 to human solute
carrier family 25, members 12 and 13.

Figure 7A shows the BLASTP search results for the aralar1 gene product
(Query) with the two best human homologous matches (Sbjct).
Figure 7B shows the comparison of human and Drosophila proteins.
'aralar1 Dm' refers to Drosophila protein encoded by aralar1, 'SLC25A12
Hs' refers to human solute carrier family 25, member 12, and 'SLC25A13
Hs' refers to human solute carrier family 25, member 13.

Figure 8 shows the expression of the aralar1 homologs in mammalian
tissues. 

Figure 8A shows the real-time PCR analysis of solute carrier family 25, 
member 12 (Slc25a12) expression in wild-type mouse tissues. 
Figure 8B shows the real-time PCR analysis of Slc25a12 expression in 
different mouse models. 

Figure 8C shows the real-time PCR analysis of solute carrier family 25, 
member 13 (Slc25a13) expression in wild-type mouse tissues. 
Figure 8D shows the real-time PCR analysis of Slc25a13 expression in 
different mouse models. 

Figure 9 shows the triglyceride content of a Drosophila how (Gadfly 
Accession Number CG10293) mutant. Shown is the change of triglyceride
content of HD-EP(3)30815 flies caused by integration of the P-vector into
the promoter of the how gene (referred to as 'HD-EP30815', column 2) in
comparison to controls containing all flies of the EP collection (referred to 
as 'EP-control', column 1).

Figure 10 shows the molecular organization of the mutated how (Gadfly
Accession Number CG10293) gene locus. 

Figure 11 shows the homology of Drosophila how (GadFly Accession
Number CG10293) to the human quaking isoforms.

Figure 11A shows the BLASTP search result for the how gene product 
(Query) with the twelve best human homologous matches (Sbjct). 
Figure 11B shows the comparison of human and Drosophila proteins. 
'CG10293 Dm' refers to Drosophila protein encoded by CG10293, 'QKI-6
Hs' refers to human QUAKING isoform 6, 'QKI-2 Hs' refers to human 
QUAKING isoform 2, 'QKI-3 Hs' refers to human QUAKING isoform 3, and 
'HQK-7B Hs' refers to human RNA binding protein HQK-7B. 

Figure 12 shows the expression of how homologs in mammalian (human)
tissue. 

Figure 12A shows the quantitative analysis of QUAKING 6 expression in 
human abdominal adipocyte cells, during the differentiation from 
preadipocytes to mature adipocytes. 

Figure 12B shows the quantitative analysis of RNA binding protein HQK-7B
expression in human abdominal adipocyte cells, during the differentiation 
from preadipocytes to mature adipocytes. 

Figure 13 shows the triglyceride content of a Drosophila Gadfly Accession
Number CG9373 mutant. Shown is the change of triglyceride content of 
HD-EP(3)31646 flies caused by ectopic expression of the CG9373 gene 
mainly in the neurons of these flies (referred to as 'HD-EP3646/elav', 
column 2) in comparison to controls with integration of this vector (referred 
to as 'random EP/elav', column 1). 

Figure 14 shows the molecular organization of the mutated CG9373 
(Gadfly Accession Number) gene locus.

Figure 15 shows the homology of Drosophila GadFly Accession Number
CG9373 to human KIAA1443 protein, unnamed protein product, and
myelin gene expression factor 2. 

Figure 15A shows the BLASTP search result for the CG9373 gene product
(Query) with the three best human homologous matches (Sbjct).

Figure 15B shows the comparison of human and Drosophila proteins. 
'CG9373 Dm' refers to Drosophila protein encoded by CG9373, 
'KIAA1341 Hs' refers to human KIAA1341 protein, 'MyEF-2 Hs' refers to 
human myelin gene expression factor 2, and 'FLJ13071 Hs' refers to 
human unnamed protein product FLJ13071.

Figure 16 shows the expression of the CG9373 homolog in mammalian 
tissues. 

Figure 16A shows the real-time PCR analysis of myelin gene expression 
factor 2 (MEF-2) expression in wild-type mouse tissues.

Figure 16B shows the real-time PCR analysis of MEF-2 expression in 
different mouse models. 

Figure 16C shows the real-time PCR analysis of MEF-2 expression in mice 
fed with a high fat diet compared to mice fed with a standard diet. 


Figure 17 shows the triglyceride content of a Drosophila cpo (Gadfly 
Accession Number CG18434) mutant. Shown is the change of triglyceride
content of EP(3)0661 flies caused by integration of the P-vector into the
promoter of the CG18434 gene (referred to as 'EP(3)0661/Tm3,Sb',
column 2) in comparison to controls containing all flies of the EP collection
(referred to as 'EP-control', column 1). 

Figure 18 shows the molecular organization of the mutated cpo (Gadfly
Accession Number CG18434) gene locus. 




Figure 19 shows the homology of Drosophila cpo to human RNA binding 
proteins with multiple splicing.

Figure 19A shows the comparison of human and Drosophila proteins. 'cpo
Dm' refers to Drosophila protein encoded by cpo, 'NP_006858 Hs' refers 
to human RNA binding protein with multiple splicing (RBPMS), and 
'IPI001611' refers to human RNA binding with multiple splicing (RBPMS) 
family member. 

Figure 19B shows the amino acid sequence encoded by Drosophila cpo 
gene (GadFly Accession Number CG31243, SEQ ID NO:1). 

Figure 20 shows the quantitative analysis of RNA binding protein with 
multiple splicing (RBPMS) expression in human abdominal adipocyte cells, 
during the differentiation from preadipocytes to mature adipocytes. 

Figure 21 shows the triglyceride content of a Drosophila Jafrad (Gadfly 
Accession Number CG1633) mutant. Shown is the change of triglyceride 
content of PX9430.2 flies caused by integration of the P-vector into the 
leader of the Jafrad gene (referred to as 'PX9430.2', column 2) in
comparison to controls without integration of this vector (herein referred
to as 'PX-control', column 1). 

Figure 22 shows the molecular organization of the mutated Jafrad (Gadfly 
Accession Number CG1633) gene locus. 

Figure 23 shows the homology of Drosophila Jafrad (GadFly Accession 
Number CG1633) to human peroxiredoxin 1 and 2. 

Figure 23A shows the BLASTP search result for the Jafrad gene product 
(Query) with the best two human homologous matches (Sbjct). 
Figure 23B shows the comparison of human and Drosophila proteins. 
'Jafrad Dm' refers to Drosophila protein encoded by Jafrad, 'PRDX1 Hs'
refers to human peroxiredoxin 1, and 'PRDX2 Hs' refers to human 
peroxiredoxin 2.

Figure 24 shows the quantitative analysis of peroxiredoxin 1 (PRDX1)
expression in human abdominal adipocyte cells, during the differentiation 
from preadipocytes to mature adipocytes. 

Figure 25 shows the triglyceride content of a Drosophila Gadfly Accession 
Number CG14440 mutant. Shown is the change of triglyceride content of
PX10162.1 flies caused by integration of the P-vector upstream of the
CG14440 gene (referred to as 'PX10162.1', column 2) in comparison to
controls without integration of this vector (herein referred to as
'PX-control', column 1). 

Figure 26 shows the molecular organization of the mutated CG14440 
(Gadfly Accession Number) gene locus. 

Figure 27 shows the BLASTP search result for the CG14440 gene product
(Query) with the best human homologous match (Sbjct). 

Figure 28 shows the quantitative analysis of hypothetical protein 
LOC55565 expression in human abdominal adipocyte cells, during the 
differentiation from preadipocytes to mature adipocytes. 

The examples illustrate the invention: 

Example 1: Measurement of triglyceride content in Drosophila 

Mutant flies were obtained from proprietary and publicly available fly 
mutation stock collections. The flies were grown under standard conditions 
known to those skilled in the art. In the course of the experiment, 
additional feedings with baker's yeast (Saccharomyces cerevisiae) were 
provided. The average change in triglyceride content of Drosophila 
containing the EP-vectors in homozygous or heterozygous viable 
integration was investigated in comparison to control flies (see Figures 1, 
5, 9, 13, 17, 21, and 25). For the determination of triglyceride, flies were 
incubated for 5 min at 90°C (in the case of PX9430.2 and PX10162.1, at 
70°C) in an aqueous buffer using a water bath, followed by hot extraction. 
After another 5 min incubation at 90°C (in the case of PX9430.2 and 
PX10162.1, at 70°C) and mild centrifugation, the triglyceride content of 
the fly extract was determined using the Sigma Triglyceride (INT 336-10 or 
-20) assay by measuring changes in the optical density according to the 
manufacturer's protocol. As a reference, the protein content of the same 
extract was measured for the EP-lines using the BIO-RAD DC Protein Assay 
according to the manufacturer's protocol. The assays were repeated several 
times. 

The average triglyceride level of all flies of the EP collections (referred to as 
'EP-control') is shown as 100% in the first columns in Figures 1, 5, 9, and 
17, respectively. The average triglyceride level of about 50 lines of the PX 
collection (referred to as 'PX-control') is shown as 100% in the first 
column in Figures 21 and 25 (relative amount of triglyceride per fly). The 
average triglyceride level of all flies containing the elav-Gal4 vector 
(referred to as 'random EP/elav') is shown as 100% in the first column in 
Figure 13. Standard deviations of the measurements are shown as thin 
bars. 
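
The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name `relative_triglyceride`, the pairing of one triglyceride reading with one protein reading per extract, and all numeric values are hypothetical. The control ratio stands for whichever control pool applies ('EP-control', 'PX-control', or 'random EP/elav'), which is displayed as 100%.

```python
from statistics import mean, stdev


def relative_triglyceride(tg_readings, protein_readings, control_ratio):
    """Return (mean, SD) of triglyceride/protein ratios as % of a control ratio.

    tg_readings and protein_readings are paired measurements from the same
    extracts; control_ratio is the mean ratio of the control pool (= 100%).
    """
    if len(tg_readings) != len(protein_readings):
        raise ValueError("each triglyceride reading needs a protein reading")
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    percents = [
        100.0 * (tg / protein) / control_ratio
        for tg, protein in zip(tg_readings, protein_readings)
    ]
    # A standard deviation needs at least two repeated measurements.
    spread = stdev(percents) if len(percents) > 1 else 0.0
    return mean(percents), spread


# Hypothetical readings for two repeats of one mutant line (illustration only).
level, spread = relative_triglyceride([0.30, 0.36], [0.5, 0.5], control_ratio=0.5)
print(f"{level:.0f}% of control (SD {spread:.0f})")
```

A value above 100% would correspond to a line with a higher triglyceride content than the control pool, a value below 100% to a leaner line.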

HD-EP(3)31805 homozygous flies (column 2 in Figure 1), EP(3)0661 
heterozygous flies (column 2 in Figure 17, referred to as 
'EP(3)0661/TM3,Sb'), PX9430.2 homozygous flies (column 2 in Figure 
21), and PX10162.1 homozygous flies (column 2 in Figure 25) consistently 
show a higher triglyceride content than the controls. EP(3)3675 
homozygous flies (column 2 in Figure 5) and HD-EP(3)30815 homozygous 
flies (column 2 in Figure 9) consistently show a lower triglyceride content 
than the controls. Therefore, the loss of gene activity in the loci where the 
EP-vectors or PX-vectors are viably integrated is responsible for changes 
in the metabolism of the energy storage triglycerides. 

HD-EP(3)31646 males were crossed to elav-Gal4 virgins. The offspring 
carry a copy of the HD-EP(3)31646 vector and a copy of the elav-Gal4 
vector, leading to ectopic expression of adjacent genomic DNA sequences 
3' of the HD-EP(3)31646 integration locus, mainly in the neurons of 
these flies. The flies were analyzed in an assay measuring their 
triglyceride content. The result of the triglyceride content analysis is 
shown in Figure 13. HD-EP(3)31646/elav flies consistently show a higher 
triglyceride content (column 2 in Figure 13) than the control EP-collection 
that is crossed to elav-Gal4 (referred to as 'random EP/elav', column 1 in 
Figure 13). Therefore, the gain of gene activity in the locus where the 
EP-vector of HD-EP(3)31646 flies is integrated in the promoter of the 
CG9373 gene is responsible for changes in the metabolism of the energy 
storage triglycerides. 

Example 2: Identification of Drosophila genes associated with regulation of 
metabolism 

Nucleic acids encoding the proteins of the present invention were identified 
using a plasmid-rescue technique. Genomic DNA sequences were isolated 
that are localized adjacent to the integration site of the vector (herein 
HD-EP(3)31805, EP(3)3675, HD-EP(3)30815, HD-EP(3)31646, EP(3)0661, 
PX9430.2, or PX10162.1). Using those isolated genomic sequences, public 
databases such as the Berkeley Drosophila Genome Project (GadFly) were 
screened, thereby identifying the integration sites of the vectors and the 
corresponding genes. The molecular organization of these gene loci is 
shown in Figures 2, 6, 10, 14, 18, 22, and 26. 

In Figures 2, 10, 14, and 26, the genomic DNA sequence is represented by the 
assembly as a dotted black line in the middle that includes the integration 
sites of the vectors for lines HD-EP(3)31805, HD-EP(3)30815, 
HD-EP(3)31646, or PX10162.1. Numbers represent the coordinates of the 
genomic DNA. The upper parts of the figures represent the sense strand 
"+", the lower parts represent the antisense strand "-". The insertion sites 
of the P-elements in the Drosophila lines are shown as triangles or boxes in 
the "P-elements +", "P-elements -", or middle lines. Transcribed DNA 
sequences (ESTs) are shown as grey bars in the "EST +" and/or the "EST -" 
lines, and predicted cDNAs are shown as bars in the "cDNA +" and/or 
"cDNA -" lines. Predicted exons of the cDNAs are shown as dark grey bars 
and introns are shown as light grey bars. 

In Figures 6, 18, and 22, the genomic DNA sequence is represented by the 
assembly as a thin black scaled double-headed arrow in the middle that 
includes the integration sites of the vectors for lines EP(3)3675, 
EP(3)0661, or PX9430.2. Numbers and ticks represent the length of the 
genomic DNA (1000 base pairs per tick in Figure 6, 10000 base pairs per 
tick in Figures 18 and 22). The upper part of each figure represents the 
sense strand, the lower part represents the antisense strand. The grey 
arrows in the upper part of Figures 6 and 22 and the dark grey box in the 
topmost part of Figure 18 represent BAC clones; the black arrows in the 
topmost part of Figures 6 and 22 and the light grey box in the middle of 
Figure 18 represent the sections of the chromosomes or GenBank units. 
The insertion sites of the P-elements in the Drosophila lines are shown as 
grey triangles in Figures 6 and 18, and as a black vertical line in Figure 22. 
The P-insertion sites are labeled. Grey bars linked by black lines represent 
cDNA sequences. Predicted genes are shown as black bars (exons) linked 
by black lines (Figures 6 and 22) or light grey serrated lines (Figure 18) 
(introns), and are labeled (see also the key at the bottom of the figures). 

The HD-EP(3)31805 vector is integrated, homozygous viable, 3 base pairs 5' 
of a Drosophila gene in antisense orientation, identified as GadFly 
Accession Number CG7956. The chromosomal localization site of the 
integration of the vector of HD-EP(3)31805 is at gene locus 3R, 93E4. In 
Figure 2, the coordinates of the genomic DNA start at position 
17260000 on chromosome 3R and end at position 17270000. The 
insertion site of the P-element in the Drosophila HD-EP(3)31805 line is shown 
in the "P-elements" line and is labeled. The predicted cDNA of the 
CG7956 gene is shown in the "cDNA +" line and is labeled. 

The EP(3)3675 vector is integrated, homozygous viable, into an intron of a 
Drosophila gene in sense orientation, identified as aralar1 (GadFly 
Accession Number CG2139). The chromosomal localization site of the 
integration of the vector of EP(3)3675 is at gene locus 3R, 99F6. In Figure 
6, the insertion site of the P-element in the Drosophila EP(3)3675 line is shown 
as a triangle in the lower part of the figure and labeled with an arrow. 
The predicted transcription variants of the Drosophila aralar1 gene (GadFly 
Accession Number CG2139) are shown as black boxes linked with thin 
black lines. 

The HD-EP(3)30815 vector is integrated, homozygous viable, into the 
promoter of a Drosophila gene in antisense orientation, identified as how 
(GadFly Accession Number CG10293). The chromosomal localization site 
of the integration of the vector of HD-EP(3)30815 is at gene locus 3R, 
94A1-2. In Figure 10, the coordinates of the genomic DNA start at 
position 17775577 on chromosome 3R and end at position 17775577. The 
insertion site of the P-element in the Drosophila HD-EP(3)30815 line is shown 
in the "P-elements -" line. The predicted cDNA of the how gene is shown 
in the "cDNA +" line and is labeled. 

The HD-EP(3)31646 vector is integrated, homozygous viable, into the 
promoter region of a Drosophila gene in sense orientation, identified as 
GadFly Accession Number CG9373. The chromosomal localization site of 
the integration of the vector of HD-EP(3)31646 is at gene locus 3R, 
85D25. In Figure 14, the coordinates of the genomic DNA start at 
position 5312505 on chromosome 3R and end at position 5318755. The 
insertion site of the P-element in the Drosophila HD-EP(3)31646 line is shown 
in the "P-elements" line. The predicted cDNA of the CG9373 gene is 
shown in the "cDNA" line and is labeled. 

The EP(3)0661 vector is integrated, homozygous lethal / heterozygous viable, 
into the promoter of RE30936.5 in sense orientation, 
representing an EST-clone of a Drosophila gene, identified as cpo (GadFly 
Accession Numbers CG18434 and CG31243). The chromosomal 
localization site of the integration of the vector of EP(3)0661 is at gene 
locus 3R, 90D1. In Figure 18, the insertion site of the P-element in the 
Drosophila EP(3)0661 line is shown as a triangle in the upper part of the 
figure and labeled with an arrow. The predicted cDNA of the cpo gene is 
shown in the upper part of the figure and is labeled. 

The PX9430.2 vector is integrated, homozygous viable, into the leader 
sequence of a Drosophila gene, identified as Jafrac1 (GadFly Accession 
Number CG1633). The chromosomal localization site of the integration of 
the vector of PX9430.2 is at gene locus X, 11E6. In Figure 22, the 
insertion site of the P-element in the Drosophila PX9430.2 line is shown as a 
labeled vertical line. The predicted transcript variants of the Drosophila 
Jafrac1 gene are shown in the upper part of the figure and are labeled. 

The PX10162.1 vector is integrated, homozygous viable, upstream of the 
5'-end of a Drosophila gene, identified as GadFly Accession Number 
CG14440. The chromosomal localization site of the integration of the 
vector of PX10162.1 is at gene locus X, 6C7. In Figure 26, the 
coordinates of the genomic DNA start at position 6494082 on 
chromosome X and end at position 6519082. The insertion site of the 
P-element in the Drosophila PX10162.1 line is shown as "+" on the dotted 
middle line. The predicted cDNA of CG14440 is shown in the "cDNA" line 
and is labeled; the corresponding EST is shown in the "EST -" line and is 
labeled. 

Expression of the genes described above could be affected by integration 
of the vectors into the transcription units, leading to a change in the 
amount of the energy storage triglycerides. 

Example 3: Identification of human homologous genes and proteins 

The Drosophila genes and the proteins encoded thereby with functions in the 
regulation of triglyceride metabolism were further analysed using the 
BLAST algorithm to search publicly available sequence databases, and 
mammalian homologs were identified (see Table 1 and Figures 3, 7, 11, 
15, 19, 23, and 27). 

Table 1: Human homologs of the Drosophila (Dm) genes 

Dm gene                   Homo sapiens homologous genes and proteins
Acc. No. (Name)           cDNA Acc. No.     Protein Acc. No.  Name

CG7956                    NM_014937         NP_055752         Sac domain-containing inositol phosphatase 2 (SAC2); KIAA0966
CG2139 (aralar1)          NM_003705         NP_003696         solute carrier family 25 (mitochondrial carrier, Aralar), member 12 (SLC25A12)
                          NM_014251         NP_055066         solute carrier family 25, member 13 (citrin) (SLC25A13)
CG10293 (how)             AF142419          AAF63414          QUAKING isoform 6 (QUAKING)
                          AF142418          AAF63413          QUAKING isoform 2 (QUAKING)
                          AF142422          AAF63417          QUAKING isoform 3 (QUAKING)
                          AB067801          BAB69499          RNA binding protein HQK-7B
CG9373                    AB037762          BAA92579          KIAA1341 protein
                          AK023133          BAB14421          unnamed protein product FLJ13071
                          NM_016132         NP_057216         myelin gene expression factor 2 (MEF-2)
CG31243, CG18434 (cpo)    NM_006867         NP_006858         RNA binding protein with multiple splicing (RBPMS)
                          ENSG00000166831   ENSP00000300069   RNA binding with multiple splicing (RBPMS) family member
CG1633 (Jafrac1)          NM_002574         NP_002565         peroxiredoxin 1 (PRDX1)
                          BC000452          AAH00452          protein similar to thioredoxin peroxidase 1
CG14440                   NM_017530         NP_060000         hypothetical protein LOC55565 (LOC55565)

CG7956, aralar1, how, CG9373, cpo, Jafrac1, or CG14440 homologous 
proteins and nucleic acid molecules coding therefor are obtainable from 
insect or vertebrate species, e.g. mammals or birds. Particularly preferred 
are nucleic acids as described in Table 1. 

The present invention describes polypeptides comprising the amino acid 
sequences of the proteins of the invention. Comparisons (Clustal W 1.83 
analysis, see for example Thompson J. D. et al., (1994) Nucleic Acids Res. 
22(22):4673-4680; Thompson J. D., (1997) Nucleic Acids Res. 
25(24):4876-4882; Higgins, D. G. et al., (1996) Methods Enzymol. 
266:383-402) between the respective proteins of different species (human 
and Drosophila) were conducted. Gaps in the alignment are represented as 
"-". Based upon homology, the Drosophila proteins of the invention and each 
homologous protein or peptide may share at least some activity. 
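
As an illustration of how a percentage can be read off a gapped pairwise alignment of the kind described here, the following Python sketch counts identical residues over the aligned, non-gap columns. The text does not state whether the homology percentages it reports count identical residues or also conservative substitutions (BLASTP "positives"); this sketch counts identities only, and the function name and the two short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap columns are not scored
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments: 4 identical residues in 5 scored columns.
print(percent_identity("MKT-LV", "MRTALV"))
```

Counting similar as well as identical residues would require a substitution matrix and generally gives a higher figure than identity alone.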

As shown in Figure 3, the gene product of Drosophila GadFly Accession 
Number CG7956 is 52% homologous to human Sac domain-containing 
inositol phosphatase 2 (SAC2, also referred to as KIAA0966 protein; 
GenBank Accession Number NP_055752.1 for the protein, NM_014937 for 
the cDNA). CG7956 also shows homology to mouse protein 
ENSMUSP00000045910 (ENSEMBL Accession Number). 

Human solute carrier family 25 (mitochondrial carrier, Aralar), member 12 
is also referred to as GenBank Accession Number XP_010876.3 for the 
protein, XM_010876 for the cDNA. As shown in Figure 7A, the gene 
product of Drosophila aralar1 is 74% homologous to human solute carrier 
family 25 (mitochondrial carrier, Aralar), member 12 and 73% homologous 
to human solute carrier family 25, member 13 (citrin). aralar1 also shows 
homology to mouse solute carrier family 25 (mitochondrial carrier; adenine 
nucleotide translocator), member 13 (GenBank Accession Number 
NP_056644.1). 

As shown in Figure 11A, the gene product of Drosophila how is 64% 
homologous to human QUAKING isoform 5 (GenBank Accession Number 
AAF63416.1 for the protein, AF142421 for the cDNA), 64% homologous 
to human protein similar to KH domain RNA binding protein QKI-5A 
(GenBank Accession Number XP_037438.2 for the protein, XM_037438 
for the cDNA), 64% homologous to QUAKING isoform 6 (GenBank 
Accession Number AAF63414.1 for the protein, AF142419 for the cDNA), 
64% homologous to unnamed protein product (GenBank Accession 
Number BAB55032.1 for the protein, AK027309 for the cDNA), 67% 
homologous to QUAKING isoform 2 (GenBank Accession Number 
AAF63413.1 for the protein, AF142418 for the cDNA), 67% homologous 
to QUAKING isoform 3 (GenBank Accession Number AAF63417.1 for the 
protein, AF142422 for the cDNA), 67% homologous to QUAKING isoform 
4 (GenBank Accession Number AAF63415.1 for the protein, AF142420 for 
the cDNA), 67% homologous to RNA binding protein HQK-6 (GenBank 
Accession Number BAB69497.1 for the protein, AB067799 for the cDNA), 
67% homologous to RNA binding protein HQK-7B (GenBank Accession 
Number BAB69499.1 for the protein, AB067801 for the cDNA), 67% 
homologous to RNA binding protein HQK-7 (GenBank Accession Number 
BAB69498.1 for the protein, AB067800 for the cDNA), 67% homologous 
to QUAKING isoform 1 (GenBank Accession Number AAF63412.1 for the 
protein, AF142417 for the cDNA), and 64% homologous to genes related to 
stomach cancer (GenBank Accession Number BD004960.1). Drosophila how 
also shows homology to mouse KH domain RNA binding protein QKI-7B 
(GenBank Accession Number AAC63042.1). 

As shown in Figure 15A, the gene product of Drosophila GadFly Accession 
Number CG9373 is 44% homologous to human KIAA1341 protein 
(GenBank Accession Number BAA92579.1 for the protein, AB037762 for 
the cDNA), 43% homologous to human unnamed protein product 
(GenBank Accession Number BAB14421.1 for the protein, AK023133 for 
the cDNA), and 43% homologous to myelin gene expression factor 2 (GenBank 
Accession Number NP_057216.1 for the protein, NM_016132 for the 
cDNA). CG9373 also shows homology to mouse myelin gene expression 
factor (GenBank Accession Number AAL90778.1). 

Drosophila cpo is also referred to as SEQ ID NO:1 in Figure 19B. Human 
RNA-binding protein gene with multiple splicing (RBPMS) is also referred to 
as GenBank Accession Number XP_047075.1 for the protein, XM_047075 
for the cDNA, and human gene similar to RNA-binding protein with multiple 
splicing is also referred to as GenBank Accession Number XP_091097 for 
the protein, XM_091097 for the cDNA. As shown in Figure 19A, the gene 
product of Drosophila CG31243 is 62% homologous to human 
RNA-binding protein with multiple splicing and 59% homologous to human 
protein similar to RNA-binding protein with multiple splicing at the 
C-terminal part, respectively. 

As shown in Figure 23A, the gene product of Drosophila Jafrac1 is 83% 
homologous to human peroxiredoxin 2 (GenBank Accession Number 
XP_009063.2 for the protein, XM_009062 for the cDNA) and 82% 
homologous to human peroxiredoxin 1 (GenBank Accession Number 
NP_002565.1 for the protein, NM_002574 for the cDNA). CG1633 also 
shows homology to mouse thioredoxin dependent peroxide reductase 2 
(GenBank Accession Number NP_035164.1) and to mouse peroxiredoxin 4 
(GenBank Accession Number NP_048044.1). 

As shown in Figure 27, the gene product of Drosophila GadFly Accession 
Number CG14440 is 57% homologous to human hypothetical protein 
LOC55565 (GenBank Accession Number NP_060000.1 for the protein, 
NM_017530 for the cDNA). CG14440 also shows homology to mouse 
protein similar to hypothetical protein LOC55565 (GenBank Accession 
Number AAH23180.1). 

The human Jafrac1 homologous protein peroxiredoxin 1 is also referred to 
as natural killer cell enhancing factor A in Patent Number US5610286-A. 
The human Jafrac1 homologous protein peroxiredoxin 2 is also referred to 
as amino acid sequence of the acid form of peroxyredoxin TDX1 in Patent 
Number FR2798672-A1. The human CG14440 homologous protein is also 
referred to as human polypeptide SEQ ID NO 3381 in Patent Number 
WO200153312-A1. 

Example 4: Genetic adipose pathway screen 

Adipose (adp) is a protein that has been described as regulating, causing or 
contributing to obesity in an animal or human (see WO 01/96371). 
Transgenic flies containing a wild type copy of the adipose cDNA under the 
control of the Gal4/UAS system were generated (Brand and Perrimon, 
1993, Development 118:401-415; for adipose cDNA, see WO 01/96371). 
Chromosomal recombination of these transgenic flies with an eyeless-Gal4 
driver line was used to generate a stable recombinant fly line 
over-expressing adipose in the developing Drosophila eye. Animals 
receiving transgenic adipose activity under these conditions developed into 
adult flies with a visible change of eye phenotype. Virgins of the 
recombinant driver line were crossed with males of the mutant EP-line 
collection in single crosses and kept for preferably 12 to 15 days at 29°C. 
The offspring were checked for modifications of the eye phenotype 
(enhancement or suppression). Mutations changing the eye phenotype 
affect genes that modify adipose activity. The inventors have found that 
the fly line HD-EP(3)35715 is a suppressor of the eye-adp-Gal4 induced 
eye phenotype. This result strongly suggests an interaction of the cpo 
gene with adipose, since the integration of HD-EP(3)35715 was found to be 
located at the cpo locus. This supports a function of cpo and 
homologous proteins in the regulation of energy homeostasis. 

Example 5: dUCPy modifier screen 

Expression of Drosophila uncoupling protein dUCPy in a non-vital organ like 
the eye (Gal4 under control of the eye-specific promoter of the "eyeless" 
gene) results in flies with visibly damaged eyes. This easily visible eye 
phenotype is the basis of a genetic screen for gene products that can 
modify UCP activity. 

Parts of the genomes of the strain with Gal4 expression in the eye and the 
strain carrying the pUAST-dUCPy construct were combined on one 
chromosome using genomic recombination. The resulting fly strain has 
eyes that are permanently damaged by dUCPy expression. Flies of this 
strain were crossed with flies of a large collection of mutagenized fly 
strains. In this mutant collection a special expression system (EP-element; 
Ref.: Rorth P, Proc Natl Acad Sci USA 1996, 93(22):12418-22) is 
integrated randomly in different genomic loci. The yeast transcription factor 
Gal4 can bind to the EP-element and activate the transcription of 
endogenous genes close to the integration site of the EP-element. The 
activation of the genes therefore occurs in the same cells (eye) that 
overexpress dUCPy. Since the mutant collection contains several thousand 
strains with different integration sites of the EP-element, it is possible to 
test for a large number of genes whether their expression interacts with 
dUCPy activity. If a gene acts as an enhancer of UCP activity, the eye 
defect will be worsened; a suppressor will ameliorate the defect. 

Using this screen, a gene with suppressing activity was discovered that 
was found to be the cpo gene in Drosophila. 

Example 6: Expression of the polypeptides in mammalian (mouse) tissues 

For analyzing the expression of the polypeptides disclosed in this invention 
in mammalian tissues, several mouse strains (preferably mouse strains 
C57Bl/6J, C57Bl/6 ob/ob and C57Bl/KS db/db, which are standard model 
systems in obesity and diabetes research) were purchased from Harlan 
Winkelmann (33178 Borchen, Germany) and maintained under constant 
temperature (preferably 22°C), 40 per cent humidity and a light/dark 
cycle of preferably 14/10 hours. The mice were fed a standard chow (for 
example, from ssniff Spezialitaten GmbH, order number ssniff M-Z 
V1126-000). For the fasting experiment ("fasted wild type mice"), wild 
type mice were deprived of food for 48 h, with only water supplied ad 
libitum (see, for example, Schnetzler et al., (1993) J Clin Invest 
92(1):272-280, Mizuno et al., (1996) Proc Natl Acad Sci U S A 
93(8):3434-3438). Animals were sacrificed at an age of 6 to 8 weeks. The 
animal tissues were isolated according to standard procedures known to 
those skilled in the art, snap frozen in liquid nitrogen and stored at -80°C 
until needed. 

RNA was isolated from mouse tissues using Trizol Reagent (for example, 
from Invitrogen, Karlsruhe, Germany) and further purified with the RNeasy 
Kit (for example, from Qiagen, Germany) in combination with a 
DNase treatment according to the instructions of the manufacturers and as 
known to those skilled in the art. Total RNA was reverse transcribed 
(preferably using Superscript II RNaseH- Reverse Transcriptase, from 
Invitrogen, Karlsruhe, Germany) and subjected to Taqman analysis, 
preferably using the Taqman 2xPCR Master Mix (from Applied Biosystems, 
Weiterstadt, Germany; according to the manufacturer, the Mix contains, for 
example, AmpliTaq Gold DNA Polymerase, AmpErase UNG, dNTPs with 
dUTP, passive reference Rox and optimized buffer components) on a 
GeneAmp 5700 Sequence Detection System (from Applied Biosystems, 
Weiterstadt, Germany). 

Taqman analysis was performed preferably using the following 
primer/probe pairs: 

For the amplification of mouse Sac domain-containing inositol phosphatase 2 
(sac2): mouse sac2 forward primer (SEQ ID NO: 1): 5'- CCT GGA TCG CAC 
CAA CG -3'; mouse sac2 reverse primer (SEQ ID NO: 2): 5'- TTA AGC TGC 
TGT TCC ATG ACC A -3'; Taqman probe (SEQ ID NO: 3): (5/6-FAM) TCC AGG 
CTG CCA TAG CGC GC (5/6-TAMRA) 

For the amplification of mouse solute carrier family 25 (mitochondrial 
carrier, Aralar), member 12 (Slc25a12): mouse Slc25a12 forward primer 
(SEQ ID NO: 4): 5'- CCT GCC AAC CCT GAT CAC A -3'; mouse Slc25a12 
reverse primer (SEQ ID NO: 5): 5'- TTT CAA TGC CAG CGA AAG TG -3'; 
Taqman probe (SEQ ID NO: 6): (5/6-FAM) CGG TGG CTA CAG ACT TGC CAC 
GG (5/6-TAMRA) 

For the amplification of mouse solute carrier family 25 (mitochondrial 
carrier; adenine nucleotide translocator), member 13 (Slc25a13): mouse 
Slc25a13 forward primer (SEQ ID NO: 7): 5'- AGC GGT GGT TCT ATG TCG 
ATT T -3'; mouse Slc25a13 reverse primer (SEQ ID NO: 8): 5'- CGG GAT TTA 
GGA ACC GGC T -3'; Taqman probe (SEQ ID NO: 9): (5/6-FAM) AGG CGT GAA 
GCC CGT GGG ATC T (5/6-TAMRA) 

For the amplification of mouse myelin gene expression factor 2 (mef2): 
mouse mef2 forward primer (SEQ ID NO: 10): 5'- ACA AGG ATG GCA AGA GCA 
GAG -3'; mouse mef2 reverse primer (SEQ ID NO: 11): 5'- ATG GAA ATT GCT 
TGG ACT GCT T -3'; Taqman probe (SEQ ID NO: 12): (5/6-FAM) CAT GGG CAC 
TGT CAC TTT TGA GCA GG (5/6-TAMRA) 
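
Basic properties of the oligonucleotides listed above can be checked with a short Python sketch. It uses the two sac2 primer sequences from the text; the helper names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a rough rule-of-thumb estimate of melting temperature that is assumed here for illustration. It is not stated to be the criterion by which these primers were designed.

```python
def clean(seq):
    """Remove the spacing used in the printed sequence and upper-case it."""
    return "".join(seq.split()).upper()


def gc_percent(seq):
    """GC content of an oligonucleotide in percent."""
    seq = clean(seq)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule."""
    seq = clean(seq)
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


# Primer sequences as printed for mouse sac2 (SEQ ID NO: 1 and 2).
primers = {
    "sac2 forward": "CCT GGA TCG CAC CAA CG",
    "sac2 reverse": "TTA AGC TGC TGT TCC ATG ACC A",
}

for name, seq in primers.items():
    print(name, len(clean(seq)), f"{gc_percent(seq):.1f}% GC", wallace_tm(seq))
```

Forward and reverse primers of a pair are usually chosen so that their estimated melting temperatures lie close together; more accurate estimates require nearest-neighbour thermodynamic models.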

In the figures the relative RNA-expression is shown on the Y-axis. In 
Figures 4A and B, 8A, B, C, and D, and 16A, B, and C, the tissues tested 
are given on the X-axis. "WAT" refers to white adipose tissue, "BAT" 
refers to brown adipose tissue. 
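
The text does not state how the relative RNA-expression values were calculated from the real time PCR data. One widely used approach is the comparative Ct method, in which expression is given as 2 to the power of minus delta-delta-Ct. The Python sketch below illustrates that method only as an assumption: the function name, the use of a reference (housekeeping) gene, the choice of calibrator tissue, and all Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the comparative Ct method (2 ** -ddCt).

    Each target Ct is first normalized to a reference gene measured in the
    same tissue; the sample is then compared with a calibrator tissue.
    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the calibrator at equal reference-gene Ct.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```

Under this scheme a value of 1 means equal expression in sample and calibrator, and each cycle of difference corresponds to a two-fold change.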

As shown in Figure 4A, real time PCR (Taqman) analysis of the expression 
of the Sac domain-containing inositol phosphatase 2 (SAC2) RNA in 
mammalian (mouse) tissues revealed that SAC2 is highly expressed in 
hypothalamus, brain, WAT, spleen and kidney. Figure 4B shows that SAC2 
is upregulated in BAT and pancreas of fasted animals as well as ob/ob 
mice. The arcuate nucleus in the hypothalamus is the region in the brain 
that regulates feeding behaviour. The high expression level of SAC2 in the 
hypothalamus and WAT strongly suggests that this gene plays a central 
role in energy homeostasis. This is supported by the upregulation of SAC2 
in BAT and the pancreas of two animal models used to study metabolic 
disorders. 

As shown in Figure 8A, real time PCR (Taqman) analysis of the expression 
of the solute carrier family 25, member 12 (Slc25a12) RNA in mammalian 
(mouse) tissues revealed that Slc25a12 is highly expressed in muscle, 
hypothalamus, brain and heart. As shown in Figure 8B, Slc25a12 is 
nine-fold upregulated in BAT of ob/ob mice and more than two-fold upregulated 
in BAT of fasted animals. Slc25a12 is nearly three-fold downregulated in 
the heart of ob/ob mice. As shown in Figure 8C, solute carrier family 25, 
member 13 (Slc25a13) is highly expressed in liver, heart and kidney of wild 
type animals. As shown in Figure 8D, Slc25a13 is strongly upregulated in 
BAT of ob/ob mice and more than four-fold downregulated in heart tissue 
of ob/ob mice. The tissue-specific expression of Slc25a12 and Slc25a13, 
together with the clear regulation in BAT and heart in the genetic model for 
obesity, suggests that Slc25a12 and Slc25a13 play a central role in 
metabolism. 

As shown in Figure 16A, real time PCR (Taqman) analysis of the 
expression of the myelin gene expression factor 2 (MEF-2) RNA in 
mammalian (mouse) tissues revealed that MEF-2 is highly expressed in 
hypothalamus, brain and testis. Furthermore, it shows robust expression 
levels in WAT, colon, lung, spleen and kidney. Figure 16B shows that 
MEF-2 is upregulated in BAT of ob/ob mice. Figure 16C shows that 
MEF-2 is also upregulated in BAT after high fat (palmitate) diet feeding. 
The upregulation of MEF-2 in BAT of a genetic model of obesity as well as 
under high fat diet suggests a central role for MEF-2 in metabolism. 

Example 7: Analysis of the differential expression of transcripts of the 
proteins of the invention in human tissues 

RNA preparation from human primary adipose tissues was done as 
described in Example 6. The hybridization and scanning were performed as 
described in the manufacturer's manual (see Affymetrix Technical Manual, 
2002, obtained from Affymetrix, Santa Clara, USA). 

In Figures 12A and B, 20, 24, and 28, the X-axis represents the time axis; 
shown are day 0 and day 12 of adipocyte differentiation. The Y-axis 
represents the fluorescence intensity. The expression analysis (using 
Affymetrix GeneChips) of the Quaking 6 (QKI6), RNA binding protein 
HQK-7B, RNA binding protein with multiple splicing (RBPMS), 
peroxiredoxin 1 (PRDX1), and hypothetical protein LOC55565 genes during 
primary human abdominal adipocyte differentiation clearly shows 
differential expression of the human QKI6, HQK-7B, RBPMS, PRDX1, and 
LOC55565 genes in adipocytes. Several independent experiments were 
done. The experiments show that the QKI6 (see Figure 12A), HQK-7B (see 
Figure 12B), and PRDX1 (see Figure 24) transcripts are more abundant at 
day 12 than at day 0 of differentiation. The experiments further show 
that the RBPMS (see Figure 20) and LOC55565 (see Figure 28) transcripts 
are more abundant at day 0 than at day 12 of differentiation. 
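
The day 0 versus day 12 comparison can be expressed as a simple fold change of the measured intensities. The Python sketch below is an illustration only: the function name, the two-fold cut-off, and the intensity values are assumptions introduced here and are not taken from the experiments described.

```python
def differentiation_trend(day0, day12, min_fold=2.0):
    """Classify a transcript by its day 12 / day 0 intensity ratio.

    min_fold is a hypothetical cut-off; ratios between 1/min_fold and
    min_fold are reported as unchanged.
    """
    if day0 <= 0 or day12 <= 0:
        raise ValueError("intensities must be positive")
    fold = day12 / day0
    if fold >= min_fold:
        return fold, "up at day 12"
    if fold <= 1.0 / min_fold:
        return fold, "down at day 12"
    return fold, "unchanged"


# Hypothetical fluorescence intensities for three transcripts.
for name, d0, d12 in [("A", 100.0, 400.0), ("B", 400.0, 100.0), ("C", 100.0, 150.0)]:
    print(name, differentiation_trend(d0, d12))
```

In practice such calls would be made on replicate experiments with an appropriate statistical test rather than on single intensity pairs.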

Thus, the QKI6, HQK-7B, or PRDX1 proteins have to be significantly 
increased in order for the preadipocytes to differentiate into mature 
adipocytes. The QKI6, HQK-7B, or PRDX1 proteins in preadipocytes have 
the potential to enhance adipose differentiation at a very early stage. The 
RBPMS or LOC55565 proteins have to be significantly decreased in order 
for the preadipocytes to differentiate into mature adipocytes. Therefore, 
the RBPMS or LOC55565 proteins in preadipocytes have the potential to 
inhibit adipose differentiation at a very early stage. 

Therefore, the QKI6, HQK-7B, RBPMS, PRDX1, and LOC55565 proteins might 
play an essential role in the regulation of human metabolism, in particular 
in the regulation of adipogenesis, and thus they might play an essential 
role in obesity, diabetes, and/or metabolic syndrome. 

For the purpose of the present invention, it will be understood by the person 
having average skill in the art that any combination of any feature 
mentioned throughout the specification is explicitly disclosed herewith. 



Claims 

1. A pharmaceutical composition comprising a CG7956, aralar1, how, 
CG9373, cpo, Jafrac1, or CG14440 nucleic acid molecule 
or a polypeptide encoded thereby and/or a functional fragment 
thereof or an effector/modulator of said nucleic acid molecule and/or 
a polypeptide encoded thereby, preferably together with 
pharmaceutically acceptable carriers, diluents and/or additives. 

2. The composition of claim 1, wherein the nucleic acid molecule is a 
vertebrate or insect CG7956, aralar1, how, CG9373, cpo, Jafrac1, 
or CG14440 nucleic acid, particularly encoding a human protein as 
described in Table 1, and/or a nucleic acid molecule which is 
complementary thereto, or a functional fragment thereof or a variant 
thereof. 

3. The composition of claim 1 or 2, wherein said nucleic acid molecule 
is selected from the group consisting of 

(a) a nucleic acid molecule encoding a polypeptide as shown in 
Table 1; 

(b) a nucleic acid molecule which comprises or is the nucleic acid 
molecule as shown in Table 1; 

(c) a nucleic acid molecule degenerate as a result of the genetic 
code to the nucleic acid sequences as defined in (a) or (b); 

(d) a nucleic acid molecule that hybridizes at 50°C in a solution 
containing 1 x SSC and 0.1% SDS to a nucleic acid molecule 
as defined in claim 2 and/or a nucleic acid molecule which is 
complementary thereto; 

(e) a nucleic acid molecule that encodes a polypeptide which is 
at least 85%, preferably at least 90%, more preferably at 
least 95%, more preferably at least 98% and up to 99.6% 
identical to a human protein as described in Table 1 or as 
defined in claim 2; and 

(f) a nucleic acid molecule that differs from the nucleic acid 
molecule of (a) to (e) by mutation and wherein said mutation 
causes an alteration, deletion, duplication or premature stop 
in the encoded polypeptide. 
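
By way of illustration only, the identity thresholds recited in item (e) could be checked as in the following minimal sketch. It assumes the two polypeptide sequences have already been aligned by an external tool, and it assumes one particular convention (identical residues divided by the number of aligned columns in which neither sequence has a gap); other conventions exist and give different values. The function names and the toy sequences are hypothetical and are not taken from Table 1.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length, with `gap` marking inserted gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the identity is at least `threshold` percent (85% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences: 9 of 10 aligned residues are identical.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV", 85.0))
```

The threshold argument can be set to 90, 95 or 98 to reflect the preferred ranges named in item (e).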

4. The composition of any one of claims 1-3, wherein the nucleic acid 
molecule is a DNA molecule, particularly a cDNA or a genomic DNA. 

5. The composition of any one of claims 1-4, wherein said nucleic acid 
encodes a polypeptide contributing to regulating the energy 
homeostasis and/or the metabolism of triglycerides. 

6. The composition of any one of claims 1-5, wherein said nucleic acid 
molecule is a recombinant nucleic acid molecule. 



7. The composition of any one of claims 1-6, wherein the nucleic acid 
molecule is a vector, particularly an expression vector. 

8. The composition of any one of claims 1-5, wherein the polypeptide 
is a recombinant polypeptide. 



9. The composition of claim 8, wherein said recombinant polypeptide is 
a fusion polypeptide. 

10. The composition of any one of claims 1-7, wherein said nucleic acid 
molecule is selected from hybridization probes, primers and 
anti-sense oligonucleotides. 

11. The composition of any one of claims 1-10 which is a diagnostic 
composition. 

12. The composition of any one of claims 1-10 which is a therapeutic 
composition. 

13. The composition of any one of claims 1-12 for the manufacture of 
an agent for detecting and/or verifying, for the treatment, alleviation 
and/or prevention of metabolic diseases or dysfunctions, including 
metabolic syndrome, obesity, and/or diabetes, as well as related 
disorders such as eating disorder, cachexia, hypertension, coronary 
heart disease, hypercholesterolemia, dyslipidemia, osteoarthritis, or 
gallstones, in cells, cell masses, organs and/or subjects. 

14. Use of a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 nucleic acid molecule, particularly of a nucleic acid 
molecule according to claim 3 (a), (b) or (c), or a polypeptide 
encoded thereby or a functional fragment or a variant of said nucleic 
acid molecule or said polypeptide and/or an effector/modulator of 
said nucleic acid molecule or polypeptide for the manufacture of a medicament for 
the treatment of obesity, diabetes, and/or metabolic syndrome for 
controlling the function of a gene and/or a gene product which is 
influenced and/or modified by a CG7956, aralar1, how, CG9373, 
cpo, Jafrac1, or CG14440 polypeptide, particularly by a polypeptide 
according to claim 3. 



15. Use of a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 nucleic acid molecule, particularly of a nucleic acid 
molecule according to claim 3 (a), (b) or (c), or a polypeptide 
encoded thereby or a functional fragment or a variant of said nucleic 
acid molecule or said polypeptide or use of an effector/modulator of 
said nucleic acid molecule or said polypeptide for identifying 
substances in vitro capable of interacting with a CG7956, aralar1, 
how, CG9373, cpo, Jafrac1, or CG14440 polypeptide, particularly 
with a polypeptide according to claim 3. 



WO 03/092715 PCT/EP03/04650 

16. A non-human transgenic animal exhibiting a modified expression of 
a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or CG14440 
polypeptide, particularly of a polypeptide according to claim 3. 

17. The animal of claim 16, wherein the expression of the CG7956, 
aralar1, how, CG9373, cpo, Jafrac1, or CG14440 polypeptide, 
particularly of a polypeptide according to claim 3, is increased 
and/or reduced. 

18. A recombinant host cell exhibiting a modified expression of a 
CG7956, aralar1, how, CG9373, cpo, Jafrac1, or CG14440 
polypeptide, particularly of a polypeptide according to claim 3. 

19. The cell of claim 18 which is a human cell. 

20. A method of identifying a (poly)peptide involved in the regulation of 
energy homeostasis and/or metabolism of triglycerides in a mammal 
comprising the steps of 

(a) contacting a collection of (poly)peptides with a CG7956, 
aralar1, how, CG9373, cpo, Jafrac1, or CG14440 
polypeptide, particularly of a polypeptide according to claim 
3, or a functional fragment thereof under conditions that 
allow binding of said (poly)peptides; 

(b) removing (poly)peptides which do not bind; and 

(c) identifying (poly)peptides that bind to said polypeptide. 

21. A method of screening for an agent which modulates/effects the 
interaction of a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 polypeptide, particularly of a polypeptide according to 
claim 3, with a binding target, comprising the steps of 

(a) incubating a mixture comprising 

(aa) a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 polypeptide, particularly of a polypeptide 
according to claim 3, or a functional fragment thereof; 

(ab) a binding target/agent of said polypeptide or functional 
fragment thereof; and 

(ac) a candidate agent 

under conditions whereby said polypeptide or functional 
fragment thereof specifically binds to said binding 
target/agent at a reference affinity; 

(b) detecting the binding affinity of said polypeptide or functional 
fragment thereof to said binding target to determine an 
affinity for the agent; and 

(c) determining a difference between affinity for the agent and 
the reference affinity. 



22. A method of screening for an agent which modulates/effects the 
activity of a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 polypeptide, particularly of a polypeptide according to 
claim 3, comprising the steps of 

(a) incubating a mixture comprising 

(aa) said polypeptide or a functional fragment thereof and 

(ab) a candidate agent 

under conditions whereby said polypeptide or functional 
fragment thereof has a reference activity; 

(b) detecting the activity of said polypeptide or functional 
fragment thereof to determine an activity in the presence of 
the agent; and 

(c) determining a difference between the activity in the presence 
of the agent and the reference activity. 
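
Purely as an illustration, the comparison in steps (b) and (c) of the screening methods above (affinity or activity in the presence of the agent versus the reference value) might be evaluated as in the following sketch. The replicate values, the function names, and the 1.5-fold cut-off are hypothetical assumptions introduced for this example; the claims themselves do not specify any threshold or statistical treatment.

```python
from statistics import mean
from typing import Dict, Sequence


def modulation(reference: Sequence[float], with_agent: Sequence[float]) -> Dict[str, float]:
    """Compare replicate measurements with and without a candidate agent.

    Returns the mean reference value, the mean value in the presence of
    the agent, their difference, and the fold change relative to reference.
    """
    ref = mean(reference)
    obs = mean(with_agent)
    if ref == 0:
        raise ValueError("reference value must be non-zero to compute a fold change")
    return {
        "reference": ref,
        "with_agent": obs,
        "difference": obs - ref,
        "fold_change": obs / ref,
    }


def classify(result: Dict[str, float], min_fold: float = 1.5) -> str:
    """Label the agent using an assumed, arbitrary fold-change cut-off."""
    fold = result["fold_change"]
    if fold >= min_fold:
        return "enhancer"
    if fold <= 1.0 / min_fold:
        return "inhibitor"
    return "no clear effect"


# Hypothetical readings in arbitrary units.
result = modulation(reference=[10.0, 10.0], with_agent=[5.0, 5.0])
print(result["difference"], result["fold_change"], classify(result))
```

In practice a statistical test over replicates would usually accompany such a cut-off; the sketch only shows the difference determination itself.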

23. A method of producing a composition comprising the (poly)peptide 
identified by the method of claim 20 or the agent identified by the 
method of claim 21 or 22 with a pharmaceutically acceptable 
carrier, diluent and/or additive. 

24. The method of claim 23 wherein said composition is a 
pharmaceutical composition for preventing, alleviating or treating 
metabolic diseases or dysfunctions, including metabolic syndrome, 
obesity, and/or diabetes, as well as related disorders such as eating 
disorder, cachexia, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, or gallstones. 

25. Use of a (poly)peptide as identified by the method of claim 20 or of 
an agent as identified by the method of claim 21 or 22 for the 
preparation of a pharmaceutical composition for the treatment, 
alleviation and/or prevention of metabolic diseases or dysfunctions, 
including metabolic syndrome, obesity, and/or diabetes, as well as 
related disorders such as eating disorder, cachexia, hypertension, 
coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, or gallstones. 

26. Use of a nucleic acid molecule as defined in any of claims 1-6 or 10 
for the preparation of a medicament for the treatment, alleviation 
and/or prevention of metabolic diseases or dysfunctions, including 
obesity, diabetes, and/or metabolic syndrome, as well as related 
disorders such as eating disorder, cachexia, hypertension, coronary 
heart disease, hypercholesterolemia, dyslipidemia, osteoarthritis, or 
gallstones. 

27. Use of a polypeptide as defined in any one of claims 1 to 6, 8 or 9 
for the preparation of a medicament for the treatment, alleviation 
and/or prevention of metabolic diseases or dysfunctions, including 
obesity, diabetes, and/or metabolic syndrome, as well as related 
disorders such as eating disorder, cachexia, hypertension, coronary 
heart disease, hypercholesterolemia, dyslipidemia, osteoarthritis, or 
gallstones. 

28. Use of a vector as defined in claim 7 for the preparation of a 
medicament for the treatment, alleviation and/or prevention of 
metabolic diseases or dysfunctions, including obesity, diabetes, 
and/or metabolic syndrome, as well as related disorders such as 
eating disorder, cachexia, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, or gallstones. 

29. Use of a host cell as defined in claim 18 or 19 for the preparation of 
a medicament for the treatment, alleviation and/or prevention of 
metabolic diseases or dysfunctions, including obesity, diabetes, 
and/or metabolic syndrome, as well as related disorders such as 
eating disorder, cachexia, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, or gallstones. 

30. Use of a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or 
CG14440 nucleic acid molecule or of a functional fragment thereof 
for the production of a non-human transgenic animal which over- or 
under-expresses the CG7956, aralar1, how, CG9373, cpo, Jafrac1, 
or CG14440 gene product. 

31. Kit comprising at least one of 

(a) a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or CG14440 
nucleic acid molecule or a functional fragment thereof; 

(b) a CG7956, aralar1, how, CG9373, cpo, Jafrac1, or CG14440 
amino acid molecule or a functional fragment thereof; 

(c) a vector comprising the nucleic acid of (a); 

(d) a host cell comprising the nucleic acid of (a) or the vector of 
(c); 

(e) a polypeptide encoded by the nucleic acid of (a); 

(f) a fusion polypeptide encoded by the nucleic acid of (a); 

(g) an antibody, an aptamer or another effector/modulator 
against the nucleic acid of (a) or the polypeptide of (b), (e) or 
(f); and 

(h) an anti-sense oligonucleotide of the nucleic acid of (a). 



Figure 1. Triglyceride content of a Drosophila CG7956 (GadFly Accession Number) 
mutant 

Figure 3. BLASTP results for CG7956 (GadFly Accession Number) 
Homology to human protein NP_055752.1 (GenBank Accession Number) 

ref|NP_055752.1| (NM_014937) KIAA0966 protein [Homo sapiens] 
Length = 1132 

Score = 573 bits (1477), Expect = e-162 

Identities = 354/972 (36%), Positives = 514/972 (52%), Gaps = 114/972 (11%) 

Query: 1 MEVFQTDSHYIFVKRDKSLWWHRRTSEFSIKAGTO 60 

ME+FQ HYI + +++LW RR ++ DL + C+G+ G++G I L + 

Sbjct: 1 IffiLFQAKDHYILQQGERALWCSRIlDGGLQLRPATDLLtiAWNPICLGLVEGVIGKIQLHSD 60 

Query: 61 YEPHLVWKEASAVGVLYPPHLVYKIKSICILSADD PDTDLPNCTKHTKSNQSTPTH 117 

L+++++ + VG L H V K+ I +LS + D +L C KH 
Sbjct: 61 LPI^LILIRQKALVGKLPGDHEIVCKvTKIAVLSLSEMEPQDLELELCKKH 110 

Query: 118 SVSTSNNNNASVPSSGGGSSKSTKLFEGMNKTWGAVKSAGNT IKNTTQQAANLATKQ 174 

G+NK + S ++ +K T +N+ + 
Sbjct: 111 HFGINKPEKI I PS PDDSKFLLKTFTHIKSNVS APN 145 

Query: 175 VKSSVGIREPRHIERRITEELHKIFDETDSFYFSFDCDITNNLQRHEAKSEESQ SQP 231 

K +E +ERR+ EEL K+F +++SFY+S D+TN++QR + + + 

Sbjct: 146 KKKVKESKEKEKLERRLLEELLKMFMDSESFYYSLTYDLTNSVQRQSTGERDGRPLWQKV 205 

Query: 232 DERFFWNKHMIRDLINLNDKT WILPIIQGFMQVENCVIG 270 

D+RFFWNK+MI+DL + WI+P+IQGF+Q+E V+ 

Sbjct: 206 DDRFFWNKYMIQDLTEIGTPDVDFWI I PMIQGFVQIEELWNYTES SDDEKSS PET PPQE 265 

Query: 271 NEC FTLALVSRRSRHRAGTRYKRRGVDEKGNCANYVETEQI LS FRHHQLS FTQ 323 

+ C F +AL+SRRSRHRAG RYKRRGVD+ GN ANYVETEQ++ +H LSF Q 

Sbjct: 2 66 STCVDDIHPRFLVALISRRSRHR&GMRYKRRGTO 325 

Query: 324 WGSVPIYWSQPGYKYRPPPRLDRG^AETQQAFELHFTKELETYGRVCIVNLVEQSGKEK 383 

RGSVP++WSQ GY+Y P PRLDR ET F HF ++L Y + I+NLV+Q+G+EK 
Sbjct: 326 TRGSVPVFWSQVGYRYNPRPRLDRSEKETVAYFCAHFEEQLNIYKKQVIINLVDQAGREK 385 

Query: 3 84 TIGDAYADHVTKLNNDRL I YVTFDFHDYCRGMRFENVSALI DAVGPEAGAMGFHWRDQRG 443 

IGDAY V+ NN L YV+ FDFH+ + CRGM+ FENV L DA+ M + W D+ G 

Sbjct: 386 1 1 GDAYLKQVLLFNNSHLTYVSFDFHEHCRGMKFENVQTLTDAI YDI I LDMKWCWVDEAG 445 

Query: 444 MICNQKS WRVNCMDCLDRTNVVQTAIGKAVLESQLVKLGLS P PYTPI PEQLKS PFMVLW 503 

+IC Q+ +FRVNCMDCLDRTNWQ AI + V+E QL KLG+ PP P+P + + ++W 
Sbjct: 446 VICKQEGIFRVNCMDCLDRTNWQAAIARVVMEQQLKKLGVMPPEQPLP 505 

Query: 504 ANNGDI I SRQYAGTNALKGDYTRTGERKISGMMKDGMNSAl^YYLARFKDS YRQATIDLM 563 

ANNGD ISRQYAGT ALKGD+TRTGERK++G+MKDG+NSANRYYL RFKD+YRQA IDLM 
Sbjct: 506 ANNGDS I SRQYAGTAALKGDFTRTGERKLAGVMKDGVNS ANRYYLNRFKDAYRQAVIDLM 565 

Query: 564 LGNQVS SESLS ALGGQAGPD ENDGTENAEQAKLLVEDCRRLLLGTAQYPVGAWGLID 620 

GV++S + + ++ +E L++ +LLL + G W LID 
Sbjct: 566 QGI PVTEDLYS I FTKEKEHEALHKENQRSHQEL I SQLLQS YMKLLLPDDEKFHGGWALID 625 

Query: 621 ADPSSGDINETEVDTILLLTDDCYIVAXHDSHLDKIVTIFEKVQLT^ 680 

DPS D +VD +LLL++ Y VA YD +DK+ +++++ L + IE+G + + 
Sbjct: 626 CDPSLIDATHRDVBVLLLLSNSAYYVAYYDDEVDKVNQYQRLSLENLEKIEIG — PEPTL 683 

Query: 681 FQGSAPAHLCLRLNYSTOEQEGYFHMFRSAI^R^ 740 

F P C+RL+Y E GYFH R A + +E+ +++ I EM + 

Sbjct : 684 F — GKPKFSCMRLHYRYKEASGYFHTLR AVMRNPEEDGKDTLQC I AEMLQ 731 

Query: 741 IALDNAGNTEVRYITGGVLQRRKSKLPTLDV PRGMPRNL S ESQLVQLSSKA 791 

I G+ I L+R+ SK P D+ +N S+ L+ K 

Sbjct: 732 ITKQAMGSD — LPIIEKKLERKSSK-PHEDIIGIRSQNQGSLAQGKNFIjMSKFSSLNQKV 788 

Query: 792 LSNMA GQFSKLGQTFKKPQAHPSSLAATMNPQVMRQRDSEIESGQEAEKAVFTLGR 847 

+ G KLG F KP+ + Ii + + + DS +E+ + V + 

Sbjct: 789 KQTKSNVNIGNLRKLG-NFTKPEMKVNFIjKPNLKVI^V^S-DSSLETMENT — GVMDKVQ 844 

Query: 848 KHRNSNS AS STDTDEHDNSLYEP EVDSDVEI AMDKSNYNE- NAFL PS VGI VMG NQK 902 

+ + 4-SD+ DL +DD ++A + + LPS GI+ + 

Sbjct: 845 AESDGDMSSDNDSYHSDEFLTNSKSDEDRQLANSLESVGPIDYVLPSCGIIASAPRIjGSR 904 

Query: 903 EDSPSSSDEIRH 914 

S SS+D H 
Sbjct: 905 SQSLSSTDSSVH 916 
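
As an aid to reading the BLASTP summaries in these figures, the statistics line (Identities, Positives, Gaps) can be extracted programmatically. The following is a minimal sketch under the assumption that the line follows the textual layout shown above; the function name is hypothetical, and the sketch does not parse the alignment rows themselves.

```python
import re

# Matches e.g. "Identities = 354/972 (36%)" within a BLASTP summary line.
_STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\((\d+)\s*%\)")


def parse_blast_stats(line: str) -> dict:
    """Return counts, alignment length and reported percentage per statistic."""
    stats = {}
    for name, count, total, pct in _STAT.findall(line):
        stats[name.lower()] = {
            "count": int(count),
            "total": int(total),
            "reported_percent": int(pct),
            # Fraction recomputed from the counts; the reported percentage
            # is an integer and may be rounded differently.
            "fraction": int(count) / int(total),
        }
    return stats


summary = "Identities = 354/972 (36%), Positives = 514/972 (52%), Gaps = 114/972 (11%)"
for name, values in parse_blast_stats(summary).items():
    print(name, values["count"], values["total"], values["reported_percent"])
```

Applied to the summary line of Figure 3, this yields 354 identical positions over an alignment length of 972.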

Figure 5. Triglyceride content of a Drosophila aralar1 (GadFly Accession Number 
CG2139) mutant 

Figure 7. Homology of Drosophila aralar1 (GadFly Accession Number CG2139) to 
human solute carrier family 25, members 12 and 13 

Figure 7A. BLASTP results for aralar1 

Homology to human protein XP_010876.3 (GenBank Accession Number) 

ref|XP_010876.3| (XM_010876) solute carrier family 25 (mitochondrial carrier, 
Aralar), member 12 [Homo sapiens] 
Length = 678 

Score = 741 bits (1913), Expect = 0.0 

Identities = 382/650 (58%), Positives = 488/650 (74%), Gaps = 14/650 (2%) 

Query: 1 MTSEDFVIUCFLGLFSESAFNDESVRLLANIM)TSKDGLISFSEFQAFEGLLCTPDAIjYRT 60 

MT EDFV+++LGL+++ N + V+LLA +AD +KDGLIS+ EF AFE +LC PD+++ 
Sbjct: 34 OTPEDFVQRYLGLYITOPNSNPKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIV 93 

Query: 61 AFQLFDRKGNGTVSYADFADVVQKTELHSKIPFSLDGPFIKRYFGDKKQRLINYAEFTQL 120 

AFQLFD+ GNG V++ + ++ +T +H IPF+ D FI+ +FG +++ +NY EFTQ 
Sbjct: 94 AFQLFDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGH^^ 153 

Query: 121 LHDFHEEHAMEAFRSKDPAGTGFI S PLDFQDI IVNVKRHLLT PGVRDNLVS VTEG HK 177 

L + EHA +AF KD + +G IS LDF DI+V ++ H+LTP V +NLVS G H+ 
Sbjct: 154 LQELQLEHARQAFALKDKSKSGMI SGLDFSDIMVTIRSHMLTPFVEENLVS AAGGS I SHQ 213 

Query: 178 VSFPYFIAFTSLLNNMELIKQVYLHATEGSRTD^^ 236 

VSF YF AF SLLNNMEL++++Y G+R D+ +TK++ +A Q+TPLEIDIL+ 

Sbjct: 214 VSFSYFNAFNSLLNNMELVRKIY-STLAGTRKDVEVTKEEFAQSAIRYGQVTPLEIDILY 272 

Query: 237 HLAGAVHQAGRIDYSDLSNIAPEHYTKHMTHRLAEIKAVESPA-DRSAFIQVIiESSYRFT 295 

LA + +GR+ +D+ IAP + + LAE++ +SP R ++Q+ ES+YRFT 

Sbjct: 273 QLADLYNASGRLTL ADI ERI APLAEGA- LPYX^AELQRQQS PGLGRP I WLQI AESAYRFT 331 

Query: 296 LGS F AGAVGATVVYPI DLVKTRMQNQR- AGS YIGEVAYRNSWDC FKKVVRHEGFMGLYRG 354 

LGS AGAVGAT VYPIDLVKTRMQNQR +GS +GE+ Y+NS+DCFKKV+R+EGF GLYRG 
Sbjct: 332 LGSVAGAVGATAWPIDLVKTRMQNQRGSGSWGELMYKNSFDCFKKVIiRYEGFFGDYRG 391 

Query: 355 LLPQLMGVAPEKAIKLTV1TOLVRDKLTDKKGNI 414 

L+PQL+GVAPEKAIKLTVND VRDK T + G++P AEVLAGGCAG SQV+FTNPLEIVK 
Sbjct: 392 LI PQLIGVAPEKAIKLTVNDFVRDKFTRRDGS VPLPAEVLAGGC AGGS QVI FTNPLEI VK 451 

Query: 415 IRLQVAGEIASGSKIRAWSVVRELGLFGLYKGARACLLRDVPFSAIYFPTYAHTKAMMAD 474 

IRLQVAGEI +G ++ A +V+R+LG+ FGLYKGA+ AC LRD+ PFSAI YFP YAH K ++AD 
Sbjct: 452 IRLQVAGEI TTGPRVSALNVLRDLGI FGLYKGAKACFLRDI PFSAI YFPVYAHCKLLLAD 511 

Query: 475 KDGYNHPLTLLAAGAIAGVPAASLVTPADVIKTRLQWARSGQTTYTGVWDATKXIMAEE 534 

++G+ L LLAAGA+AGVPAASLVTPADVIKTRLQV AR+GQTTY+GV D +KI+ EE 
Sbjct: 512 ENGHVGGLNLLAAGAMAGVPAAS LVTP ADVI KTRLQVAARAGQTT YS GVIDCFRKI LREE 571 

Query: 535 GPRAFWKGTAARVFRSSPQFGVTLVTYELLQRLFYVDFGGTQPKGSEAHKITTPLEQAAA 594 

GP AFWKGTAARVFRS SPQFGVTLVTYELLQR FY+DFGG +P GSE TP + A 

Sbjct: 572 GPSAFWKGTAARVFRSSPQFGVTLVTYELLQRWFYIDFGGLKPAGSE PTP-KSRIA 626 

Query: 595 SVTTENVDH IGGYRAAVPLLAGVESKFGLYLPRF-GRGVTAAS PSTATGS 643 

+ N DHIGGYR A AG+E+KFGLYLP+F V P A + 
Sbjct: 627 DLPPANPDHIGGYRL ATAT FAGI ENKFGLYLPKFKS PSVAWQPKAAVAA 676 

Homology to human protein NP_055066.1 (GenBank Accession Number) 

ref|NP_055066.1| (NM_014251) solute carrier family 25, member 13 (citrin) 
[Homo sapiens] 
Length = 675 

Score = 728 bits (1878), Expect = 0.0 

Identities = 374/643 (58%), Positives = 476/643 (73%), Gaps = 17/643 (2%) 

Query: 1 MTSEDFVRKFLGLFSESAFNDESVRLLANIADTSKDGLISFSEFQAFEGLLCTPDALYRT 60 

M+ DFV ++L +F ES N ++V LL+ + D +KDGLISF EF AFE +LC PDAL+ 
Sbjct: 35 MS PNDFVTRYLNI FGESQPNPKTVELL SGWDQTKDGL I SFQEFVAFESVLCAPDALFMV 94 

Query: 61 AFQLFDRKGNGTVSYADFADWQKTELHSKIPFSLDGPFIKRYFGDKKQRLINYAEFTQL 120 

AFQLFD+ G G V++ D V +T +H IPF+ D F++ +FG +++R + YAEFTQ 
Sbjct: 95 AFQLFDKAGKGEVTFEDVKQVFGQTTIHQHIPFK^ 154 

Query: 121 LHDFHEEHAMEAFRSKDP AGTGF I S PLDFQDI IVNVKRHLLT PGVRDNLVSVTEG HK 177 

L + EHA +AF +D A TG ++ +DF+DI+V ++ H+LTP V + LV+ G H+ 
Sbjct: 155 LLEIQLEHAKQAFVQRDNARTGRVTAIDFRDIMVTIRPHVLTPFVEECLVAAAGGTTSHQ 214 

Query: 178 VSFPYFIAFTSLLNNMELIKQVYLHATEGSRTDM-ITKDQILLAAQTMSQITPLEIDILF 236 

VSF YF F SLLNNMELI+++Y G+R D+ +TK++ +LAAQ Q+TP+E+DILF 

Sbjct: 215 VSFS YFNGFNS LLNNMELIRKI Y - STLAGTRKDVEVTKEEFVLAAQKFGQVT PMEVDI LF 273 

Query: 237 HLAGAVHQAGRIDYSDL SNI AP- EHYTKHMTHRLAEIKAVES PAD — RSAFI QVLES SYR 293 

LA GR+ +D+ IAP E T + LAE + ++ D R +QV ES+YR 

Sbjct: 274 QLADLYEPRGRMTL ADI ERI APLEEGT — LPFNLAEAQRQKASGDSARFVLLQVAES AYR 331 

Query: 294 FTLGSFAGAVGATVVYPIDLVKTRMQNQRA-GSYIGEVAYRNSVTOCFKKVVRHEGFMGLY 352 

F LGS AGAVGAT VYPIDLVKTRMQNQR+ GS++GE+ Y+NS+DCFKKV+R+EGF GLY 
Sbjct: 332 FGLGSVAGAVGATAVYPIDLVKTRMQNQRSTQSFVGELMYKNSFDCFKKVLRYEGFFGLY 391 

Query: 353 RGLLPQLMGVAPEKAIKLTVNDLV^ 412 

RGLLPQL+GVAPEKAIKLTVND VRDK K G++P AE+LAGGCAG SQV+ FTNPLEI 
Sbjct: 392 RGLLPQLLGVAPEKAIKLTVITOFVEIDKFMHKDGSVPL AAEI LAGGCAGGSQVI FTNPLEI 451 

Query: 413 VKIRLQVAGEIASGSKIRAWSVVRELGLFGLYKGARACLLRDVPFSAIYFPTYAHTKAMM 472 

VKIRLQVAGE I +G ++ A SWR+LG FG+YKGA+AC LRD+PFSAIYFP YAH KA 
Sbjct: 452 VKIRLQVAGEITTGPRVSALSWRDLGFFGIYKGAKACFLRDIPFSAIYFPCYAHVKASF 511 

Query: 473 ADKDGYl^PLTLLAAGAIAGVPAASLVTPADVIKTRLQWARSGQTTYTGVWDATKKIMA 532 

A++DG P +LL AGAI AG+ PAASLVTPADVIKTRLQV AR+GQTTY+GV D +KI + 
Sbjct: 512 ANEDGQVSPGSLLLAGAI AGMPAASLVT PADVIKTRLQVAARAGQTTYSGVIDCFRKI LR 571 

Query: 533 EEGPRAFWKGTAARVFRS S PQFGVTLVTYELLQRLFYVDFGGTQPKGS EAHKITTPLEQA 592 

EEGP+A WKG ARVFRS S PQFGVTL+TYELLQR FY+DFGG +P GSE P+ + + 

Sbjct: 572 EEGPKALWKGAGARVFRS S PQFGVTLLTYELLQRWFYIDFGGVKPMGS E PVPKS 625 

Query: 593 AASVTTENVDHIGGYRAAVPLLAGVESKFGLYLPRFGRGVTAA 63 5 

++ N DH+GGY+ AV AG+E+KFGLYLP F V+ + 
Sbjct: 626 RINLPAPNPDHVGGYKLAVATFAGIENKFGLYLPLFKPSVSTS 668 

Figure 7B. Multiple Sequence Alignment (ClustalW 1.83) 

aralarl Dm MPLTKSLPNSPSLLKRAGTEKLR^ 

SLC25A12 Hs MAVKVQTTKRGDPHELRNIFLQYASTEVDGERYMTPED^ 

SLC25A13 Hs MAAAKVALTKRADPAELRTIFLKYAS I EKNGEFFMS PNDFVTRYLNIFGES QPNP 

aralarl Dm ESVRLLANIADTSKDGLISFSEFQAFEGLLCTPDALYRTAFQLFDRKGNGTVSYADFADV 
SLC25A12 HS KIVQIjI*AGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQLFDKSGNGEVTFENVKEI 
SLC25A13 HS KTVELLSGKATDQTKDGLISFQEFVAFESVLCAPDALFW^ 

aralarl Dm VQKTELHSKI PF SLDGPFIKRYFGDKKQRLINYAEFTQLLHDFHEEHAMEAFRSKDPAGT 
SLC25A12 HS FGQTI IHHHI PFNWDCEF IRLHFGHNRKKHLNYTEFTQFLQELQLEHARQAFALKDKSKS 
SLC25A13 HS FGQTT IHQHI PFNWDSEFVQLHFGKERKRHLTYAEFTQFLLEI QLEHAKQAFVQRDNART 

aralarl Dm GFI SPLDFQDI I VNVKRHLLT PGVRDNLVSVTEG HKVSFPYFIAFTSLLNNMELIKQ 

SLC25A12 Hs GMI SGLDF SDIMVTIRSHMLTPFVEENLVSAAGGSI SHQVSFSYFNAFNSLLNNMELVRK 
SLC25A13 Hs GRVTAIDFRDIM\^IRPHVLTPFVEECLVAAAGGTTSHQVSFSYFNGFNSIjIjNNMELIRK 

aralarl Dm VYIiHATEGSRTDMITKDQIIiIiAAQTMSQITPLEIDILFHLAGAVHQAGRIDYSDLSNIAP 
SLC25A12 Hs IYSTLAGTRKDVEVTKEEFAQSAIRYGQVTPLEIDILYQLADIiYNASGRLTLADIERIAP 
SLC25A13 Hs IYSTLAGTRKDVEVTKEEFVLAAQKFGQVTPMEVDILFQLADLYEPRGRMTLADIERIAP 

aralarl Dm EHYTKHMTHRLAEIKAVESPA- -DRS AFI QVLES SYRFTLGSFAGAVGATWYP I DLVKT 
SLC25A12 HS LAEG- ALP YNLAELQRQQS PG- LGRPIWLQI AES AYRFTLGSVAGAVGAT AVYP IDLVKT 
SLC25A13 HS LEEG- TLPFNLAEAQRQKASGDS ARPVLLQVAES AYRFGLGS VAGAVGAT AVYP IDLVKT 

aralarl Dm RMQNQR-AGSYIGWAYRNSWDCFKKVVRHEGFMGLYRG 

SLC25A12 Hs RMQNQRGSGSVVGELMYKNSFDCFKKVLRYEGFFGLYRGL I PQLI GVAPEKAIKLTVNDF 
SLC25A13 Hs RMQNQRSTGSFVGELMYKNSFDCFKKVLRYEGFFGLYRGLLPQLLGVAPEKAIKLTVNDF 

aralarl Dm VRDKLTDKKGNI PTWAEVL AGGC AGASQWFTNPLEIVKI RLQVAGEI AS GSKIRAWSW 
SLC25A12 Hs VRDKFTRRDGSVPLPAEVLAGGC AGGSQVT FTNPLEI VKI RLQVAGEITTGPRVS ALNVL 
SLC25A13 HS VRDKFMHKDGS VPLAAEILAGGCAGGS QVT FTNPLEIVKI RLQVAGEITTGPRVS ALSW 

aralarl Dm RELGLFGLYKGARACLLRDVPFSAIYFPTYAHTKAMMADKDGYNHPLTLLAAGAIAGVPA 
SLC25A12 Hs RDLGIFGLYKGAKAGFLRDIPFSAIYFPWAHCKLLLADENGHVGGLl^LAAGAMAGVPA 
SLC25A13 HS RDLGFFGI YKGAKACFLRDI PFS AI YFPCYAHVKASFANEDGQVS PGSLLLAGAI AGMPA 

aralarl Dm ASLVT PADVI KTRLQVVARSGQTT YTGVTWDATKKIMAEEGPRAFWKGTAARVFRS S PQFG 
SLC25A12 HS ASLVT PADVIKTRLQVAARAGQTTYSGVIDCFRKILREEGPSAFWKGTAARVFRS S PQFG 
SLC25A13 HS ASLVTPADVIKTRLQVAARAGQTTYS GVTDCFRKI LREEGPKALWKGAGARVFRS SPQFG 

aralarl Dm VTLVTYELLQRLFYVDFGGTQPKGSEAHKITTPLEQAAASVTTENVDHIGGYRAAVPLLA 

SLC25A12 Hs VTLVTYELLQRWFYIDFGGLKPAGSEP TPKSRI AD- LPPANPDHIGGYRL ATATFA 

SLC25A13 Hs VTLLTYELLQRWFYIDFGGVKPMGSEP VPKSRIN — LPAPNPDHVGGYKLAVATFA 

aralarl Dm GVESKFGLYLPRFGRGVTAAS PSTATGS 

SLC25A12 HS GIENKFGL YLPKFKS PSVAWQPKAAVAATQ 
SLC25A13 Hs GIENKFGLYLPLFK-PSVSTSKAIGGGP 

Figure 9. Triglyceride content of a Drosophila how (GadFly Accession Number 
CG10293) mutant 
Samples shown: EP-control, HD-EP30815 

Figure 11. Homology of Drosophila how (GadFly Accession Number CG10293) to human 
Quaking isoforms 

Figure 11A. BLASTP results for CG10293 (GadFly Accession Number) 

gb|AAF63416.1|AF142421_1 (AF142421) QUAKING isoform 5 [Homo sapiens] 
Length = 337 

Score = 289 bits (739), Expect = 5e-77 

Identities = 168/334 (50%), Positives = 215/334 (64%), Gaps = 20/334 (5%) 

Query: 67 QQQQSTQSIADYLAQLLKDRKQLAAFPN VFTHVERLLDEEIARVRASLFQ — ING-V 120 

+ ++ + DYL QL+ D+K +++ PN +F H+ERLLDEEI+RVR ++ +NG 
Sbjct: 2 ETKEKPKPTPDYLMQLMNDKKIiMSSLPI^CGIFNHLERIjLDEEISRVRKDMYNDTLNGST 61 

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREH PDFNFVGRI LGPRGMTAKQLEQETGCKIMVRG 180 

+K LP+ G +V + EK+ YVPV+ E+ PDFNFVGRI LGPRG+TAKQLE ETGCKIMVRG 
Sbjct: 62 EKRSAELPDAVGPIVQLQEKLWPVKEYPDFNFV 121 

Query: 181 KGSMRDKXKEDANRGKPNWEHLS DDLHVL I WEDT ENRATVKL AQ AVAEVQKLLVPQAEG 240 

KGSMRDKKKE+ NRGKPNWEHL+ +DLHVLITVED +NRA +KL +AV EV+KLLVP AEG 
Sbjct: 122 KGSBflRDKKKEEQNRGKPNWEHLNEDL^ 1B1 

Query: 241 EDEIiKKRQLMELAI INGTYRDTT AK PGLAAQIRA 3 00 

ED LKK QLMELAI+NGTYRD KS A+ A + R++T A +R 

Sbjct: 182 EDS LKKMQLMELAI LNGTYRDANIKS P ALAF S — LAATAQAAPRI ITGPAPVLPPAALRT 239 

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG — HGMIFAPYDYANYAALA 357 

PAP PLI + V + + PTAA G G+I+ PY+Y Y 

Sbjct: 240 PTPAGPTIMPLIRQIQTAV MPNGTPHPTAAIVPPGPEAGLIYTPYEYP-YTLAP 292 

Query: 358 GNPLLTEYADHS — VGAIKQQRRLATNREHPYQR 389 

+L + S +GA+ + R R HPYQR 
Sbjct: 293 ATS ILEYPI EPSGVLGAVATKVRRHDMRVHP YQR 326 

ref|XP_037438.2| (XM_037438) similar to KH domain RNA binding protein QKI-5A 
[Homo sapiens], Length = 341 

Score = 289 bits (739), Expect = 5e-77 

Identities = 168/334 (50%), Positives = 215/334 (64%), Gaps = 20/334 (5%) 

Query: 67 QQQQSTQS I ADYLAQLLKDRKQLAAFPN VFTHVERLLDEEIARVRASLFQ — ING-V 120 

+ ++ + DYL QL+ D+K +++ PN +F H+ERLLDEEI+RVR ++ +NG 
Sbjct: 6 ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGI FNHLERLLDEEI SRVRKDMYNDTLNGST 65 

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180 

+K L p + Q +V + EK+ YVPV+ E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG 
Sbjct: 66 EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 125 

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAI^ 240 

KGSMRDKKKE+ NRGKPNWEHL + +DLHVLITVED +NRA +KL +AV EV+KLLVP AEG 
Sbjct: 126 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLL 185 

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300 

ED LKK QLMELAI +NGTYRD KS A+ A + R++T A +R 

Sbjct: 186 EDSLKKMQLMELAI LNGTYRDANIKS PALAFS — LAATAQAAPRI ITGPAPVLPPAALRT 243 






Query: 301 PA-AAPLGAPLILNPRMWPTTAASILS 357 

p A P PLI + V + + PTAA G G+I+ PY+Y Y 

Sbjct: 244 PTPAGPTIMPLIRQIQTAV MPNGT PHPTAAIVPPGPEAGLI YTP YEYP - YTLAP 296 

Query: 358 GNPLLTEYADHS — VGAIKQQRRL ATNREHPYQR 389 

+L + S +GA+ + R R HPYQR 
Sbjct: 297 ATSILEYPIEPSGVLGAVATKVRRHDMRVHPYQR 330 

gb|AAF63414.1|AF142419_1 (AF142419) QUAKING isoform 6 [Homo sapiens] 
Length = 363 

Score = 289 bits (739), Expect = 5e-77 

Identities = 168/334 (50%), Positives = 215/334 (64%), Gaps = 20/334 (5%) 

Query: 67 QQQQSTQSIADYLAQLLKDRKQLAAFPN VFTHVERLLDEEI ARVRASLFQ- - ING-V 120 

+ ++ + DYL QL+ D+K +++ PN +F H+ERLLDEEI+RVR ++ +NG 
Sbjct: 28 ETKEKPKPT PDYIjMQLMNDKKLMS SLPNFCGI FNHLERLLDEEI SRVRKDMYNDTLNGST 87 

Query: 121 KKEPLTL PEPEGSVVTMNEKVYVFVREH PDFNFVGRI LGPRGMTAKQLEQETGCKIMVRG 180 

+ K LP+ G +V + EK+YVPV+E+ PDFNFVGRI LGPRG+TAKQLE ETGCKIMVRG 
Sbjct: 88 EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 147 

Query: 181 KGSMRDKKKEDANRGKPNWEHLSD 240 

KGSMRDKKKE+ NRGKPNVJEHL+ +DLHVLITVED +NRA +KL 4-AV EV+KLLVP AEG 
Sbjct: 148 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKL 207 

Query: 241 EDEDKKRQLMELAIINGTYRDTTAKSVAVC^ 3 00 

ED LKK QLMELAI +NGTYRD KS A+ A + R++T A +R 

Sbjct: 208 EDSLKKMQLMEL AI LNGT YRDANIKS P ALAFS — LAATAQAAPRI ITGPAPVLPPAALRT 265 

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG — HGMI F AP YD Y ANY AAL A 357 

PAP PLI + V + + PTAA G G+I+ PY+Y Y 

Sbjct: 266 PTPAGPTIMPLIRQIQTAV MPNGTPHPTAAIVPPGPEAGLIYTPYEYP- YTLAP 318 

Query: 358 GNPLLTEYADHS— VGAIKQQRRLATNREHPYQR 389 

+L + S +GA+ + R R HPYQR 

Sbjct: 319 AT S ILEYPI EPSGVLGAVATKVRRHDMRVHPYQR 3 52 



dbj|BAB55032.1| (AK027309) unnamed protein product [Homo sapiens] 
Length = 323 

Score = 282 bits (722), Expect = 5e-75 

Identities = 165/320 (51%), Positives = 208/320 (64%), Gaps = 20/320 (6%) 

Query: 81 QLLKDRKQLAAFPN VFTHVERLLDEEIARVRASLFQ — ING-VKKEPLTLPEPEGSV 134 

QL+ D+K + + + PN +F H+ERLLDEEI+RVR ++ +NG +K LP+ G + 
Sbjct: 2 QLMNDKKLMS SL PNFCGI FNHLERLLDEEI SRVRKDMYNDTLNGSTEKRS AELPDAVGPI 61 

Query: 135 VTMNEKVWPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRGKGS^^ 194 

V + EK+ YVPV+ E+ PDFNFVGRI LGPRG+TAKQLE ETGCKIMVRGKGSMRDKKKE+ NR 
Sbjct: 62 VQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRGKGSMRDKKKEEQNR 121 

Query: 195 GKPNWEHLSDDLHVLITVTSDTENRATVKLAQAVAEV^ 254 

GKPNWEHL + +DLHVLITVED +NRA +KL +AV EV+KLLVP AEGED LKK QLMELAI 
Sbjct: 122 GKPNVreHLNEDLHVLITVEDAQNRAEIKLKRAV^ 181 

Query: 255 INGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRAPA-AAPLGAPLILN 313 

+NGTYRD KS A+ A + R++T A +R P A P PLI 

Sbjct: 182 LNGT YRDANIKS PALAFS — LAATAQAAPRI ITGPAPVLPPAALRTPTPAGPTIMPLIRQ 239 

Query: 314 PRMTVPTTAASILSAQAAPTAAFDQTG- -HGMIFAPYDYANYAALAGNPLLTEYADHS- - 3 69 

+ V + + PTAA G G+I + PY+Y Y +L + S 

Sbjct: 240 IQTAV MPNGTPHPTAAIVPPGPEAGLIYTPYEYP-YTLAPATSILEYPIEPSGV 292 

Query: 370 VGAIKQQRRLATNREHPYQR 389 

+GA+ + R R HPYQR 
Sbjct: 293 LGAVATKVRRHDMRVHPYQR 312 

gb|AAF63413.1|AF142418_1 (AF142418) QUAKING isoform 2 [Homo sapiens] 
Length = 347 

Score = 280 bits (716), Expect = 2e-74 

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67 QQQQSTQSIADYLAQLLKDRKQLAAFPN VFTHVERLLDEEI ARVRASLFQ- - ING-V 120 

+ ++ + DYL QL+ D+K +++ PN +F H+ ERLLDEEI +RVR ++ +NG 
Sbjct: 28 ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGI FNHLERLLDEEI SRVRKDMYNDTLNGST 87 

Query: 121 KKEPLTL PE PEGSVVTMNEKVYVPVREHPDFNFVGRI LGPRGMTAKQLEQETGCKIMVRG 180 

+K LP+ G +V + EK+ YVPV+ E+ PDFNFVGRI LGPRG+ TAKQLE ETGCKIMVRG 
Sbjct: 88 EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 147 

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVED 240 

KGSMRDKKKE+ NRGKPNWEHL+ +DLHVLITVED +NRA +KL +AV EV+KLLVP AEG 
Sbjct: 148 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQIX^^ 207 

Query: 241 EDEIiKKRQLMELAI INGTYRDTTAKSVAVCDEEWRRLVAASDSRLLT STGLPGLAAQIRA 3 00 

ED LKK QLMELAI +NGTYRD KS A+ A + R++T A +R 

Sbjct: 208 EDSLKKMQLMELAILNGTYRDANIKS PALAFS — IiAATAQAAPRI ITGPAPVLPPAALRT 265 

Query: 301 PA- AAPLGAPLILNPRMTVPTTAAS I LS AQAAPTAAFDQTG — HGMIF APYDY 350 

P A P PLI + V + + PTAA G G+I+ PY+Y 

Sbjct: 266 PTPAGPTIMPL IRQ IQTAV MPNGTPHPTAAIVPPGPEAGLIYTPYEY 312 

gb|AAF63417.1|AF142422_1 (AF142422) QUAKING isoform 3 [Homo sapiens] 
Length = 341 

Score = 280 bits (716), Expect = 2e-74 

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67 QQQQSTQSIADYLAQLLKDRKQLAAFPN VFTHVERLLDEEIARVRASLFQ — ING-V 120 

+ ++ + DYL QL+ D+K +++ PN +F H+ERLLDEEI+RVR ++ +NG 
Sbjct: 28 ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGI FNHLERLLDEEI SRVRKDMYNDTLNGST 87 

Query: 121 KKEPLTLPEPEGSVWMNEKVYVPWEHPD 180 

+K LP+ G +V + EK+ WPV+ E+ PDFNFVGRI LGPRG+ TAKQLE ETGCKIMVRG 
Sbjct: 88 EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 147 

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLIWEDTENRATV^ 240 

KGSMRDKKKE+ NRGKPNWEHL+ +DLHVLITVED +NRA +KL +AV EV+KLLVP AEG 
Sbjct: 148 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKL 207 

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300 

ED LKK QLMELAI +NGTYRD KS A+ A + R++T A +R 

Sbjct: 208 EDSLKKMQLMELAILNGTYRDANIKSPALAFS — LAATAQAAPRI ITGPAPVLPPAALRT 265 

Query: 301 PA- AAPLGAPLILNPRMTVPTTAAS I L S AQAAPTAAFDQTG- - HGMI FAPYDY 350 

PAP PLI + V + + PTAA G G+I+ PY+Y 

Sbjct: 266 PTPAGPTIMPLIRQIQTAV - -MPNGTPHPTAAIVPPGPEAGLI YTPYEY 312 

gb|AAF63415.1|AF142420_1 (AF142420) QUAKING isoform 4 [Homo sapiens] 
Length = 315 

Score = 280 bits (716), Expect = 2e-74 

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67  QQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-V 120
           + ++  +   DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG
Sbjct: 2   ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGST 61

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180
           +K    LP+  G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG
Sbjct: 62  EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 121

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEG 240
           KGSMRDKKKE+ NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEG
Sbjct: 122 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEG 181

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300
           ED LKK QLMELAI+NGTYRD   KS A+         A +  R++T        A +R
Sbjct: 182 EDSLKKMQLMELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRT 239

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDY 350
           P  A P   PLI   +  V      + +    PTAA    G   G+I+ PY+Y
Sbjct: 240 PTPAGPTIMPLIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEY 286



dbj|BAB69497.1| (AB067799) RNA binding protein HQK-6 [Homo sapiens]
Length = 319

Score = 280 bits (716), Expect = 2e-74 

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67  QQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-V 120
           + ++  +   DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG
Sbjct: 6   ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGST 65

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180
           +K    LP+  G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG
Sbjct: 66  EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 125

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEG 240
           KGSMRDKKKE+ NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEG
Sbjct: 126 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEG 185

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300
           ED LKK QLMELAI+NGTYRD   KS A+         A +  R++T        A +R
Sbjct: 186 EDSLKKMQLMELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRT 243

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDY 350
           P  A P   PLI   +  V      + +    PTAA    G   G+I+ PY+Y
Sbjct: 244 PTPAGPTIMPLIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEY 290

dbj|BAB69499.1| (AB067801) RNA binding protein HQK-7B [Homo sapiens]
Length = 319 

Score = 280 bits (716), Expect = 2e-74 

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67  QQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-V 120
           + ++  +   DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG
Sbjct: 6   ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGST 65

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180
           +K    LP+  G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG
Sbjct: 66  EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 125

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEG 240
           KGSMRDKKKE+ NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEG
Sbjct: 126 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEG 185

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300
           ED LKK QLMELAI+NGTYRD   KS A+         A +  R++T        A +R
Sbjct: 186 EDSLKKMQLMELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRT 243

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDY 350
           P  A P   PLI   +  V      + +    PTAA    G   G+I+ PY+Y
Sbjct: 244 PTPAGPTIMPLIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEY 290

dbj|BAB69498.1| (AB067800) RNA binding protein HQK-7 [Homo sapiens]
Length = 325

Score = 280 bits (716), Expect = 2e-74

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67  QQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-V 120
           + ++  +   DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG
Sbjct: 6   ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGST 65

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180
           +K    LP+  G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG
Sbjct: 66  EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 125

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEG 240
           KGSMRDKKKE+ NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEG
Sbjct: 126 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEG 185

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300
           ED LKK QLMELAI+NGTYRD   KS A+         A +  R++T        A +R
Sbjct: 186 EDSLKKMQLMELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRT 243

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDY 350
           P  A P   PLI   +  V      + +    PTAA    G   G+I+ PY+Y
Sbjct: 244 PTPAGPTIMPLIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEY 290
gb|AAF63412.1|AF142417_1 (AF142417) QUAKING isoform 1 [Homo sapiens]
Length = 321

Score = 280 bits (716), Expect = 2e-74

Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%) 

Query: 67  QQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-V 120
           + ++  +   DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG
Sbjct: 2   ETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGST 61

Query: 121 KKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRG 180
           +K    LP+  G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRG
Sbjct: 62  EKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRG 121

Query: 181 KGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEG 240
           KGSMRDKKKE+ NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEG
Sbjct: 122 KGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEG 181

Query: 241 EDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRA 300
           ED LKK QLMELAI+NGTYRD   KS A+         A +  R++T        A +R
Sbjct: 182 EDSLKKMQLMELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRT 239

Query: 301 PA-AAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDY 350
           P  A P   PLI   +  V      + +    PTAA    G   G+I+ PY+Y
Sbjct: 240 PTPAGPTIMPLIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEY 286



dbj|BD004960.1| Genes related to stomach cancer, Length = 1993
Score = 288 bits (738), Expect = 1e-77

Identities = 168/324 (51%), Positives = 211/324 (64%), Gaps = 11/324 (3%) 
Frame = +1 

Query: 77  DYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLFQ--ING-VKKEPLTLPEP 130
           DYL QL+ D+K +++ PN   +F H+ERLLDEEI+RVR  ++   +NG  +K    LP+
Sbjct: 4   DYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYNDTLNGSTEKRSAELPDA 183

Query: 131 EGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGCKIMVRGKGSMRDKKKE 190
            G +V + EK+YVPV+E+PDFNFVGRILGPRG+TAKQLE ETGCKIMVRGKGSMRDKKKE
Sbjct: 184 VGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGCKIMVRGKGSMRDKKKE 363

Query: 191 DANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLLVPQAEGEDELKKRQLM 250
           + NRGKPNWEHL++DLHVLITVED +NRA +KL +AV EV+KLLVP AEGED LKK QLM
Sbjct: 364 EQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLLVPAAEGEDSLKKMQLM 543

Query: 251 ELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGLAAQIRAPA-AAPLGAP 309
           ELAI+NGTYRD   KS A+         A +  R++T        A +R P  A P   P
Sbjct: 544 ELAILNGTYRDANIKSPALAFS--LAATAQAAPRIITGPAPVLPPAALRTPTPAGPTIMP 717

Query: 310 LILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDYANYAALAGNPLLTEYAD 367
           LI   +  V      + +    PTAA    G   G+I+ PY+Y  Y       +L    +
Sbjct: 718 LIRQIQTAV------MPNGTPHPTAAIVPPGPEAGLIYTPYEYP-YTLAPATSILEYPIE 876

Query: 368 HS--VGAIKQQRRLATNREHPYQR 389
            S  +GA+  + R    R HPYQR
Sbjct: 877 PSGVLGAVATKVRRHDMRVHPYQR
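
The percentages shown in the Identities and Gaps fields of the statistics lines above appear to be the raw ratio with the fractional part dropped (156/293 is listed as 53%, 17/293 as 5%). A minimal Python sketch of that arithmetic follows. The function names are illustrative and are not part of BLAST, and truncation rather than rounding is an assumption inferred from the listed values. The Positives field is deliberately not recomputed, because the values listed for it do not all follow the same rule.

```python
import re

# Matches fields such as "Identities = 156/293 (53%)" in a BLAST statistics line.
# Only Identities and Gaps are checked (see the note on Positives above).
FIELD_RE = re.compile(r"(Identities|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def truncated_percent(count, length):
    """Whole-number percentage with the fraction dropped (assumed convention)."""
    return (100 * count) // length


def check_stats_line(line):
    """Return {field: (count, length, reported %, recomputed %)} for one line."""
    result = {}
    for name, count, length, reported in FIELD_RE.findall(line):
        count, length, reported = int(count), int(length), int(reported)
        result[name] = (count, length, reported, truncated_percent(count, length))
    return result


line = "Identities = 156/293 (53%), Positives = 198/293 (67%), Gaps = 17/293 (5%)"
for name, values in check_stats_line(line).items():
    print(name, values)
```

A mismatch between the reported and recomputed value in a given line is a quick hint that a digit was damaged in transcription.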
Figure 11B. Multiple Sequence Alignment (ClustalW 1.83)

CG10293 Dm  MSVCESKAWQQQLQQHLQQQAAAAWAVAQQQQAQAQAQAQAQAQQQQQAPQVVVPMTP
QKI-6 Hs    MLSLSSLRRNSGRNSGSCGAWN
QKI-2 Hs    MLSLSSLRRNSGRNSGSCGAWN
QKI-3 Hs    MLSLSSLRRNSGRNSGSCGAWN
HQK-7B Hs

CG10293 Dm  QHLTPQQQQQSTQSIADYLAQLLKDRKQLAAFPN---VFTHVERLLDEEIARVRASLF--
QKI-6 Hs    -MVGEMETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYND
QKI-2 Hs    -MVGEMETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYND
QKI-3 Hs    -MVGEMETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYND
HQK-7B Hs   -MVGEMETKEKPKPTPDYLMQLMNDKKLMSSLPNFCGIFNHLERLLDEEISRVRKDMYND

CG10293 Dm  QING-VKKEPLTLPEPEGSVVTMNEKVYVPVREHPDFNFVGRILGPRGMTAKQLEQETGC
QKI-6 Hs    TLNGSTEKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGC
QKI-2 Hs    TLNGSTEKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGC
QKI-3 Hs    TLNGSTEKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGC
HQK-7B Hs   TLNGSTEKRSAELPDAVGPIVQLQEKLYVPVKEYPDFNFVGRILGPRGLTAKQLEAETGC

CG10293 Dm  KIMVRGKGSMRDKKKEDANRGKPNWEHLSDDLHVLITVEDTENRATVKLAQAVAEVQKLL
QKI-6 Hs    KIMVRGKGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLL
QKI-2 Hs    KIMVRGKGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLL
QKI-3 Hs    KIMVRGKGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLL
HQK-7B Hs   KIMVRGKGSMRDKKKEEQNRGKPNWEHLNEDLHVLITVEDAQNRAEIKLKRAVEEVKKLL

CG10293 Dm  VPQAEGEDELKKRQLMELAIINGTYRDTTAKSVAVCDEEWRRLVAASDSRLLTSTGLPGL
QKI-6 Hs    VPAAEGEDSLKKMQLMELAILNGTYRDANIKSPALAFSLAATAQAAP--RIITGPAPVLP
QKI-2 Hs    VPAAEGEDSLKKMQLMELAILNGTYRDANIKSPALAFSLAATAQAAP--RIITGPAPVLP
QKI-3 Hs    VPAAEGEDSLKKMQLMELAILNGTYRDANIKSPALAFSLAATAQAAP--RIITGPAPVLP
HQK-7B Hs   VPAAEGEDSLKKMQLMELAILNGTYRDANIKSPALAFSLAATAQAAP--RIITGPAPVLP

CG10293 Dm  AAQIRAP-AAAPLGAPLILNPRMTVPTTAASILSAQAAPTAAFDQTG--HGMIFAPYDYA
QKI-6 Hs    PAALRTPTPAGPTIMPLIR------QIQTAVMPNGTPHPTAAIVPPGPEAGLIYTPYEYP
QKI-2 Hs    PAALRTPTPAGPTIMPLIR------QIQTAVMPNGTPHPTAAIVPPGPEAGLIYTPYEYP
QKI-3 Hs    PAALRTPTPAGPTIMPLIR------QIQTAVMPNGTPHPTAAIVPPGPEAGLIYTPYEYP
HQK-7B Hs   PAALRTPTPAGPTIMPLIR------QIQTAVMPNGTPHPTAAIVPPGPEAGLIYTPYEYP

CG10293 Dm  NYAALAGNPLLTEYADHS VGAIKQQRRLATNREHPYQRATVGVPAKPAGFIEIQ
QKI-6 Hs    --YTLAPATSILEYPIEPSGVLGAVATKVRRHDMRVHPYQRXVTADRAATGN--
QKI-2 Hs    --YTLAPATSILEYPIEPSGVLEWIEMPVMP-DISAH
QKI-3 Hs    --YTLAPATSILEYPIEPSGVLGMAFPTKG
HQK-7B Hs   --YTLAPATSILEYPIEPSGVLGKFFSPWG
Figure 15. Homology of Drosophila GadFly Accession Number CG9373 to human
KIAA1443 protein, human unnamed protein product, and human myelin gene
expression factor 2

Figure 15A. BLASTP results for GadFly Accession Number CG9373
Homology to human protein BAA92579.1 (GenBank Accession Number)

dbj|BAA92579.1| (AB037762) KIAA1341 protein [Homo sapiens]
Length = 620
Score = 249 bits (635), Expect = 1e-64

Identities = 207/660 (31%), Positives = 295/660 (44%), Gaps = 148/660 (22%) 

Query: 1 MSMDASNSVESREKERDRRGRGAR-GSRFTDADGNGN-GAGSQGGGVAARDRSRERRNCR 58
+ M+ S + + + + G++ +RF +NGG + G RNR
Sbjct: 72 VKMENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKG PNRN-R 121

Query: 59 VYISNIPYDYRWQDLKDLFRRIVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKM 118
V+ISNIPYD +WQ +KDL R VG + YV+LF D GK+RGCG+VEFKD E V+KALE M
Sbjct: 122 VFISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALETM 181

Query: 119 NRYEVNGRELVVKEDHGEQRDQYGRIVRDGGGGGGGGGGVQGGNGGNNGGGGGGGRDHMD 178
N+Y+++GR L +KED + + + R GG GG H+
Sbjct: 182 NKYDLSGRPLNIKEDPDGENARRA-LQRTGGSFPGG HVP 219

Query: 179 DRDRGFSRRDDDRLSGRNNFNMMSNDYNNSSNYNLYGLSASFLESLGISGPLHNKVFVAN 238
D G L NN N+ +N +G L + +FVAN
Sbjct: 220 DMGSGLMNLPPSIL NNPNIPPEVISNLQ AGRLGSTIFVAN 259

Query: 239 LDYKVDNKKLKQVFKLAGKVQSVDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQMLF 298
LD+KV KKLK+VF +AG V+ D+ DK+G SRG + ++ +EAVQAISM + Q LF
Sbjct: 260 LDFKVGWKKLKEVFSIAGTVKRADIKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFLF 319

Query: 299 DRRMTVRLD--RIPDK NEGIKLPEGLGGVGIGLGPNGEPLRDVAHNLPNGGQSQ 350
DR M V++D +P + + +LP GLGG+G+GLGP G+P+ N+
Sbjct: 320 DRPMHVKMDDKSVPHEEYRSHDGKTPQLPRGLGGIGMGLGPGGQPISASQLNI 372

Query: 351 GQLLGNAQQGSQLGSVGSQPNSSAVSNATTNLLNNLTGVMFGNHAAVQPSPVAPVQKPSL 410
G ++GN G + G FG +
Sbjct: 373 GGVMGNLGPGGM GMDGPGFGG MNRI 397

Query: 411 GNNTGSGGLNLNNLNPSILAAVVGNLGNQG--GNLSNPLLSSSL SNLGLNLGNS 462
G G GGL N +G G G G L ++SS+ ++G+N G
Sbjct: 398 GGGIGFGGLEAMN SMGGFGGVGRMGELYRGAMTSSMERDFGRGDIGINRGFG 449

Query: 463 GNDDNLPPSNVGLSNNYSSGGTGGGNSYSSGNNYSGGGGSSN LGYNAYSSS-G 514
+ L + +G +G G N G+ SGG GS N +G + SSS
Sbjct: 450 DSFGRLGSAMIG GFAGRIGSSNMGPVGSGISGGMGSMNSVTGGMGMGLDRMSSSFD 505

Query: 515 GMGGGNGGVGVDGNDYNTGNPLDVYGGGSNVGNSNVGSANAVGASRKSDTIIIKNVPITC 574
MGGG+ D + G GG +GS K + I ++N+P
Sbjct: 506 RMGPGIGAILERSIDMDRGFLSGPMGSGM RERIGS-KGNQIFVRNLPFDL 554

Query: 575 TWQTLRDKFREIGDVKFAEI RGNDVGVVRFFKERDAELAIALMDGSRLDGRNIKV 629
TWQ L++KF + G V FAEI + G VRF AE A +M+G ++ GR I V
Sbjct: 555 TWQKLKEKFSQCGHVMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDV 614




Score = 68.6 bits (166), Expect = 2e-10

Identities = 41/114 (35%), Positives = 67/114 (57%), Gaps = 6/114 (5%)

Query: 20 GRGARGSRFTDADGNGNGAGSQGGGVAARDRSRERRNCRVYISNIPYDYRWQDLKDLFRR 79
G GA R D D G +G G G+ R+R + N ++++ N+P+D WQ LK+ F +
Sbjct: 510 GIGAILERSIDMD-RGFLSGPMGSGM--RERIGSKGN-QIFVRNLPFDLTWQKLKEKFSQ 565

Query: 80 IVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKMNRYEVNGRELVVKED 133
G + + ++ E+GK++GCG V F PE+ +KA MN +++GRE+ V+ D
Sbjct: 566 C-GHVMFAEIKM-ENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLD 617

Score = 56.2 bits (134), Expect = 1e-06

Identities = 46/180 (25%), Positives = 76/180 (41%), Gaps = 21/180 (11%)

Query: 139 DQYGRIVRDGGGGGGG GGGVQGGNGGNNGGGGGGGRDHMDDRDRGF SRRD 188
D +GR+ GG G G G+ GG G N GG G +D F R
Sbjct: 450 DSFGRLGSAMIGGFAGRIGSSNMGPVGSGISGGMGSMNSVTGGMGMG-LDRMSSSFDRM- 507

Query: 189 DDRLSGRNNFNMMSNDYNNSSNYNLYGLSASFLESLGISGPLHNKVFVANLDYKVDNKKL 248
q ++ + + ++ E +G G N++FV NL + + +KL
Sbjct: 508 GPGIGAILERSIDMDRGFLSGPMGSGMRERIGSKG NQIFVRNLPFDLTWQKL 559

Query: 249 KQVFKLAGKVQSVDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQMLFDRRMTVRLDR 308
K+ F G V + + ++ G S+G + +D P A +A +++ + R + VRLDR
Sbjct: 560 KEKFSQCGHVMFAEIKMEN-GKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDR 618

Homology to human protein BAB14421.1 (GenBank Accession Number)

dbj|BAB14421.1| (AK023133) unnamed protein product [Homo sapiens]
Length = 576

Score = 242 bits (618), Expect = 1e-62

Identities = 206/654 (31%), Positives = 289/654 (43%), Gaps = 160/654 (24%) 

Query: 1 MSMDASNSVESREKERDRRGRGAR-GSRFTDADGNGN-GAGSQGGGVAARDRSRERRNCR 58
+ M+ S+++ + G++ +RF +NGG+G RNR
Sbjct: 52 VKMENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKG PNRN-R 101

Query: 59 VYISNIPYDYRWQDLKDLFRRIVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKM 118
V+ISNIPYD +WQ +KDL R VG + YV+LF D GK+RGCG+VEFKD E V+KALE M
Sbjct: 102 VFISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALETM 161

Query: 119 NRYEVNGRELVVKED-HGEQRDQYGRIVRDGGGGGGGGGGVQGGNGGNNGGGGGGGRDHM 177
N+Y+++GR L +KED GE + + R GG GG H+
Sbjct: 162 NKYDLSGRPLNIKEDPDGENARRASQ--RTGGSFPGG HV 198

Query: 178 DDRDRGFSRRDDDRLSGRNNFNMMSNDYNNSSNYNLYGLSASFLESLGISGPLHNKVFVA 237
D G L NN N+ +N +G L + +FVA
Sbjct: 199 PDMGSGLMNLPPSIL NNPNIPPEVISNLQ AGRLGSTIFVA 238

Query: 238 NLDYKVDNKKLKQVFKLAGKVQSVDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQML 297
NLD+KV KKLK+VF +AG V+ D+ DK+G SRG + ++ +EAVQAISM + Q L
Sbjct: 239 NLDFKVGWKKLKEVFSIAGTVKRADIKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFL 298

Query: 298 FDRRMTVRLD--RIPDK NEGIKLPEGLGGVGIGLGPNGEPLRDVAHNLPNGGQS 349
FDR M V++D +P + + +LP GLGG+G+GLGP G+P+ N+
Sbjct: 299 FDRPMHVKMDDKSVPHEEYRSHDGKTPQLPRGLGGIGMGLGPGGQPISASQLNI 352

Query: 350 QGQLLGNAQQGSQLGSVGSQPNSSAVSNATTNLLNNLTGVMFGNHAAVQPSPVAPVQKPS 409
G ++GN G + G FG
Sbjct: 353 -GGVMGNLGPGGM GMDGPGFGG MNR 376

Query: 410 LGNNTGSGGLNLNNLNPSILAAVVGNLGNQG--GNLSNPLLSSSL SNLGLNLGN 461
+G G GGL N +G G G G L ++SS+ ++G+N G
Sbjct: 377 IGGGIGFGGLEAMN SMGGFGGVGRMGELYRGAMTSSMERDFGRGDIGINRG- 427

Query: 462 SGNDDNLPPSNVGLSNNYSSGGTGGGNSYSSGNNYSGGGGSSNLGYNAYSSS-GGMGGGN 520
G S GG GG NS + G +G + SSS MG G
Sbjct: 428 FGDSFGRLGGGMGGMNSVT GGMGMGLDRMSSSFDRMGPGI 467

Query: 521 GGVGVDGNDYNTGNPLDVYGGGSNVGNSNVGSANAVGASRKSDTIIIKNVPITCTWQTLR 580
G+ D + G GG +GS K + I ++N+P TWQ L+
Sbjct: 468 GAILERSIDMDRGFLSGPMGSGM RERIGS KGNQIFVRNLPFDLTWQKLK 516

Query: 581 DKFREIGDVKFAEI RGNDVGVVRFFKERDAELAIALMDGSRLDGRNIKV 629
+KF + G V FAEI + G VRF AE A +M+G ++ GR I V
Sbjct: 517 EKFSQCGHVMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDV 570

Score = 72.8 bits (177), Expect = 1e-11

Identities = 82/348 (23%), Positives = 133/348 (37%), Gaps = 96/348 (27%)

Query: 54 RRNCRVYISNIPYDYRWQDLKDLFRRIVGSIEYVQLFFDESGKARGCGIVEFKDPENVQK 113
R + W+ LK++F I G+ + + + D+ GK+RG G V F+ +
Sbjct: 230 RLGSTIFVANLDFKVGWKKLKEVFS-IAGTVKRADIKEDKDGKSRGMGTVTFEQAIEAVQ 288

Query: 114 ALEKMNRYEVNGRELVVKED HGEQRDQYGRIVRDGGGGGGGG- 155
A+ N + R + VK D HER G+ + G GG G
Sbjct: 289 AISMFNGQFLFDRPMHVKMDDKSVPHEEYRSHDGKTPQLPRGLGGIGMGLGPGGQPISAS 348

Query: 156 GGVQG GNGGNNGGGGG GGRDHMDDRDRGF 184
GGV G G GG N GGG GG M + RG
Sbjct: 349 QLNIGGVMGNLGPGGMGMDGPGFGGMNRIGGGIGFGGLEAMNSMGGFGGVGRMGELYRGA 408

Query: 185 SRRDDDRL S GRNNFNMMS NDYNNS SNYNL YGL SAS FLESLG 225
+R GR + + N L + S+SF + +G
Sbjct: 409 MTSSMERDFGRGDIGINRGFGDSFGRLGGGMGGMNSVTGGMGMGLDRMSSSF-DRMGPGI 467

Query: 226 ISGPLH NKVFVANLDYKVDNKKLKQVFKLAGKVQS 260
+SGP+ N++FV NL + + +KLK+ F G V
Sbjct: 468 GAILERSIDMDRGFLSGPMGSGMRERIGSKGNQIFVRNLPFDLTWQKLKEKFSQCGHVMF 527

Query: 261 VDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQMLFDRRMTVRLDR 308
+ + ++ G S+G + +D P A +A +++ + R + VRLDR
Sbjct: 528 AEIKMEN-GKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDR 574

Score = 68.6 bits (166), Expect = 2e-10

Identities = 41/114 (35%), Positives = 67/114 (57%), Gaps = 6/114 (5%)

Query: 20 GRGARGSRFTDADGNGNGAGSQGGGVAARDRSRERRNCRVYISNIPYDYRWQDLKDLFRR 79
G GA R D D G +G G G+ R+R + N ++++ N+P+D WQ LK+ F +
Sbjct: 466 GIGAILERSIDMD-RGFLSGPMGSGM--RERIGSKGN-QIFVRNLPFDLTWQKLKEKFSQ 521

Query: 80 IVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKMNRYEVNGRELVVKED 133
G + + ++ E+GK++GCG V F PE+ +KA MN +++GRE+ V+ D
Sbjct: 522 C-GHVMFAEIKM-ENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLD 573

Homology to human protein NP_057216.1 (GenBank Accession Number)

ref|NP_057216.1| (NM_016132) myelin gene expression factor 2 [Homo sapiens]
gb|AAD43038.1| (AF106685) myelin gene expression factor 2 [Homo sapiens]
Length = 547

Score = 238 bits (607), Expect = 2e-61 

Identities = 204/659 (30%), Positives = 295/659 (43%), Gaps = 150/659 (22%) 

Query: 3 MDASNSVESREKERDRRGRGAR-GSRFTDADGNGN-GAGSQGGGVAARDRSRERRNCRVY 60
M+ S + + + + G++ +RF + N G G + G RN RV+
Sbjct: 1 MENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKG PNRN-RVF 50

Query: 61 ISNIPYDYRWQDLKDLFRRIVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKMNR 120
ISNIPYD +WQ +KDL R VG + YV+LF D GK+RGCG+VEFKD E V+KALE MN+
Sbjct: 51 ISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALETMNK 110

Query: 121 YEVNGRELVVKEDHGEQRDQYGRIVRDGGGGGGGGGGVQGGNGGNNGGGGGGGRDHMDDR 180
Y+++GR + +KED + + +RG QG++GG
Sbjct: 111 YDLSGRRVNIKEDPDGENARRA-LQRTGTSFQGSHASDVGSG 151

Query: 181 DRGFSRRDDDRLSGRNNFNMMSNDYNNSSNYNLYGLSASFLESLGISGPLHNKVFVANLD 240
N+ + NN + + + +L +G L + +FVANLD
Sbjct: 152 LVNLPPSILNNPN IPPEVISNLQ-AGRLGSTIFVANLD 188

Query: 241 YKVDNKKLKQVFKLAGKVQSVDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQMLFDR 300
+KV KKLK+VF +AG V++ DK+G SRG + ++ +EAVQAISM + Q LFDR
Sbjct: 189 FKVGWKKLKEVFSIAGTVKAGSYKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFLFDR 248

Query: 301 RMTVRLD RIPDKNEGIKLPEGLGGVGIGLGPNGEPLRDVAHNLPNGGQSQG 351
M V++D R PD + +LP GLGG+G+GLGP G+P+ N+ G
Sbjct: 249 PMHVKMDDKSVPHEEYRSPD-GKTPQLPRGLGGIGMGLGPGGQPISASQLNI G 300

Query: 352 QLLGNAQQGSQLGSVGSQPNSSAVSNATTNLLNNLTGVMFGNHAAVQPSPVAPVQKPSLG 411
++GN G + G FG +G
Sbjct: 301 GVMGNLGPGGM GMDGPGFGG MNRIG 325

Query: 412 NNTGSGGLNLNNLNPSILAAVVGNLGNQG--GNLSNPLLSSSLS NLGLNLGNSG 463
G GGL N +G G G G L ++SS+ ++GL+ G
Sbjct: 326 GGIGFGGLEAMN SMGGFGGVGRMGELYRGAMTSSMERDFGHRDIGLSRGFGD 377

Query: 464 NDDNLPPSNVGLSNNYSSGGTGGGNSYSSGNNYSGGGGSSN LGYNAYSSS-GG 515
+ L + +G +G G N G+ SGG GS N +G + SSS
Sbjct: 378 SFGRLGSAMIG GITGRIGSSNMGPVGSGISGGMGSMNSVTGGMGMGLDRMSSSFDR 433

Query: 516 MGGGNGGVGVDGNDYNTGNPLDVYGGGSNVGNSNVGSANAVGASRKSDTIIIKNVPITCT 575
MGGG+ D + G GG +GS K + I ++N+P T
Sbjct: 434 MGPGIGAILERSIDMDRGFLSGPMGSGM RERIGS KGNQIFVRNLPFDLT 482

Query: 576 WQTLRDKFREIGDVKFAEI RGNDVGVVRFFKERDAELAIALMDGSRLDGRNIKV 629
WQ L++KF + G V FAEI + G VRF AE A +M+G ++ GR I V
Sbjct: 483 WQKLKEKFSQCGHVMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDV 541
Score = 68.6 bits (166), Expect = 2e-10

Identities = 41/114 (35%), Positives = 67/114 (57%), Gaps = 6/114 (5%)

Query: 20 GRGARGSRFTDADGNGNGAGSQGGGVAARDRSRERRNCRVYISNIPYDYRWQDLKDLFRR 79
G GA R D D G +G G G+ R+R + N ++++ N+P+D WQ LK+ F +
Sbjct: 437 GIGAILERSIDMD-RGFLSGPMGSGM--RERIGSKGN-QIFVRNLPFDLTWQKLKEKFSQ 492

Query: 80 IVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEKMNRYEVNGRELVVKED 133
G + + ++ E+GK++GCG V F PE+ +KA MN +++GRE+ V+ D
Sbjct: 493 C-GHVMFAEIKM-ENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLD 544



Score = 55.5 bits (132), Expect = 2e-06

Identities = 41/157 (26%), Positives = 69/157 (43%), Gaps = 11/157 (7%)

Query: 152 GGGGGGVQGGNGGNNGGGGGGGRDHMDDRDRGFSRRDDDRLSGRNNFNMMSNDYNNSSNY 211
GGG+GGGNGGG+D FR G ++ + +
Sbjct: 400 GPVGSGISGGMGSMNSVTGGMGMG-LDRMSSSFDRM GPGIGAILERSIDMDRGF 452

Query: 212 NLYGLSASFLESLGISGPLHNKVFVANLDYKVDNKKLKQVFKLAGKVQSVDLSLDKEGNS 271
+ + E +G G N++FV NL + + +KLK+ F G V ++ ++ G S
Sbjct: 453 LSGPMGSGMRERIGSKG NQIFVRNLPFDLTWQKLKEKFSQCGHVMFAEIKMEN-GKS 508

Query: 272 RGFAVIEYDHPVEAVQAISMLDRQMLFDRRMTVRLDR 308
+G + +D P A +A +++ + R + VRLDR
Sbjct: 509 KGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDR 545



Figure 15B. Multiple Sequence Alignment (ClustalW 1.83) 

CG9373 Dm
KIAA1341 Hs PLSRSEPLSSGGRGGGSGGGMADANKAEVPGATGGDSPHLQPAEPPGEPRREPHPAEAEK
MyEF-2 Hs
FLJ13071 Hs MADANKAEVPGATGGDSPHLQPAEPPGEPRREPHPAEAEK

CG9373 Dm MSMDASNSVESREKERDRRGRGARGSRFTDADGNGNGAGSQGGGVAARDRSRERRNC
KIAA1341 Hs QQPQHSSSSNGVKMENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKGPNRN
MyEF-2 Hs MENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKGPNRN
FLJ13071 Hs QQPQHSSSSNGVKMENDESAKEEKSDLKEKSTGSKKANRFHPYSKDKNSGTGEKKGPNRN

CG9373 Dm RVYISNIPYDYRWQDLKDLFRRIVGSIEYVQLFFDESGKARGCGIVEFKDPENVQKALEK
KIAA1341 Hs RVFISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALET
MyEF-2 Hs RVFISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALET
FLJ13071 Hs RVFISNIPYDMKWQAIKDLMREKVGEVTYVELFKDAEGKSRGCGVVEFKDEEFVKKALET

CG9373 Dm MNRYEVNGRELVVKEDHGEQRDQYGRIVRDGGGGGGGGGGVQGGNGGNNGGGGGGGRDHM
KIAA1341 Hs MNKYDLSGRPLNIKEDPDGENARR ALQRTGGSFPGGHVPDMGSG
MyEF-2 Hs MNKYDLSGRRVNIKEDPDGENARR ALQRTGTSFQGSHASDVGSG
FLJ13071 Hs MNKYDLSGRPLNIKEDPDGENARR ASQRTGGSFPGGHVPDMGSG

CG9373 Dm DDRDRGFSRRDDDRLSGRNNFNMMSNDYNNSSNYNLYGLSASFLESLGISGPLHNKVFVA
KIAA1341 Hs LMNLPPSILNNPNIPPEVISNLQ AGRLGSTIFVA
MyEF-2 Hs LVNLPPSILNNPNIPPEVISNLQ AGRLGSTIFVA
FLJ13071 Hs LMNLPPSILNNPNIPPEVISNLQ AGRLGSTIFVA

CG9373 Dm NLDYKVDNKKLKQVFKLAGKVQSVDLSLDKEGNSRGFAVIEYDHPVEAVQAISMLDRQML
KIAA1341 Hs NLDFKVGWKKLKEVFSIAGTVKRADIKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFL
MyEF-2 Hs NLDFKVGWKKLKEVFSIAGTVKAGSYKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFL
FLJ13071 Hs NLDFKVGWKKLKEVFSIAGTVKRADIKEDKDGKSRGMGTVTFEQAIEAVQAISMFNGQFL

CG9373 Dm FDRRMTVRLDRIPDKNEGIK LPEGLGGVGIGLGPNGEPLRDVAHNLPNGGQS
KIAA1341 Hs FDRPMHVKMDDKSVPHEEYRSHDGKTPQLPRGLGGIGMGLGPGGQPISASQLNIG
MyEF-2 Hs FDRPMHVKMDDKSVPHEEYRSPDGKTPQLPRGLGGIGMGLGPGGQPISASQLNIG
FLJ13071 Hs FDRPMHVKMDDKSVPHEEYRSHDGKTPQLPRGLGGIGMGLGPGGQPISASQLNIG

CG9373 Dm QGQLLGNAQQGSQLGSVGSQPNSSAVSNATTNLLNNLTG-VMFGNHAAVQPSPVAPVQKP
KIAA1341 Hs GVMGNLG--PGGMGMDGPGFGGMNRIGGGIGFGGLEAMN
MyEF-2 Hs GVMGNLG--PGGMGMDGPGFGGMNRIGGGIGFGGLEAMN
FLJ13071 Hs GVMGNLG--PGGMGMDGPGFGGMNRIGGGIGFGGLEAMN

CG9373 Dm SLGNNTGSGGLNLNNLNPSILAAVVGNLGNQGGNLSNPLLSSSLSNLGLNLGNSGNDDNL
KIAA1341 Hs SMGGFGGVG--RMGELYRGAMTSSMERDFGRGDIGINRGFGDSFGRLGSAM-IGGFAGRI
MyEF-2 Hs SMGGFGGVG--RMGELYRGAMTSSMERDFGHRDIGLSRGFGDSFGRLGSAM-IGGITGRI
FLJ13071 Hs SMGGFGGVG--RMGELYRGAMTSSMERDFGRGDIGINRGFGDSFGRLG

CG9373 Dm PPSNVGLSNNYSSGGTGGGNSYSSGNNYSGGGGSSNLGYNAYSSSGGMGGGNGGVGVDGN
KIAA1341 Hs GSSNMGPVGSGISGGMGSMNSVTGGMGMGLDRMSSSFDR MGPGIGAILERSI
MyEF-2 Hs GSSNMGPVGSGISGGMGSMNSVTGGMGMGLDRMSSSFDR MGPGIGAILERSI
FLJ13071 Hs GGMGGMNSVTGGMGMGLDRMSSSFDR MGPGIGAILERSI

CG9373 Dm DYNTGNPLDVYGGGSNVGNSNVGSANAVGASRKSDTIIIKNVPITCTWQTLRDKFREIGD
KIAA1341 Hs DMDRG FLSGPMGSGMRERIGSKGNQIFVRNLPFDLTWQKLKEKFSQCGH
MyEF-2 Hs DMDRG FLSGPMGSGMRERIGSKGNQIFVRNLPFDLTWQKLKEKFSQCGH
FLJ13071 Hs DMDRG FLSGPMGSGMRERIGSKGNQIFVRNLPFDLTWQKLKEKFSQCGH

CG9373 Dm VKFAEIRGND VGVVRFFKERDAELAIALMDGSRLDGRNIKVTYF
KIAA1341 Hs VMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDRNA
MyEF-2 Hs VMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDRNA
FLJ13071 Hs VMFAEIKMENGKSKGCGTVRFDSPESAEKACRIMNGIKISGREIDVRLDRNA
Figure 17. Triglyceride content of a Drosophila cpo (GadFly Accession Number
CG18434) mutant 
Figure 19. Homology of Drosophila cpo (GadFly Accession Number CG31243 and
CG18434) to human RNA binding proteins with multiple splicing

Figure 19A. Multiple Sequence Alignment (ClustalW 1.83) 



cpo Dm LVKIANYQDLLGSHHQLLIAATAAAAAAAAAEPQLQLQHLLPAAPTTPAVISNPINSIGP

NP_006858 Hs 

IPI00161102 HS 

cpo Dm INQISSSSHPSNNNQQAVFEKAITISSIAIKRRPTLPQTPASAPQVLSPSPKRQCAAAVS 

NP_006858 Hs 

IPI00161102 Hs 

cpo Dm VLPVTVPVPVPVSVPLPVSVPVPVSVKGHPISHTHQIAHTHQISHSHPISHPHHHQLSFA 

NP_006858 Hs

IPI00161102 Hs 

cpo Dm HPTQFAAAVAAHHQQQQQQQAQQQQQAVQQQQQQAVQQQQVAYAVAASPQLQQQQQQQQH 

NP_006858 Hs 

IPI00161102 Hs 

cpo Dm RLAQFNQAAAAALLNQHLQQQHQAQQQQHQAQQQSDAHYGGYQLHRYAPQQQQQHILLSS

NP_006858 Hs 

IPI00161102 Hs 

cpo Dm GSSSSKHNSNNNSNTSAGAASAAVPIATSVAAVPTTGGSLPDSPAHESHSHESNSATASA

NP_006858 Hs

IPI00161102 Hs

cpo Dm PTTPSPAGSVTSAAPTATATAAAAGSAAATAAATGTPATSAVSDSNNNLNSSSSSNSNSN

NP_006858 Hs MNNGGK 

IPI00161102 Hs 

cpo Dm AIMENQMALAPLGLSQSMDSVNTASNEEEVRTLFVSGLPMDAKPRELYLLFRAYEGYEGS 

NP_006858 Hs AEKENTPSEANL QEEEVRTLFVSGLPLDIKPRELYLLFRPFKGYEGS

IPI00161102 Hs QVRTLFVSGLPVDIKPRELYLLFRPFK 

cpo Dm LLKVTSKNGKTASPVGFVTFHTRAGAEAAKQDLQGVRFDPDMPQTIRLEFAKSNTKVSKP

NP_006858 Hs LIKLTSKQ PVGFVSFDSRSEAEAAKNALNGIRFDPEIPQTLRLEFAKANTKMAKN

IPI00161102 Hs --PVGFVXFDSRAGAEAAKNALNGIRFDPENPQTLRLEFAKANTKMAKS

cpo Dm KPQPNTATTASHPALMHPLTG HLGGPFFPGGPELWHHPLAYSAAAAAELPG 

NP_006858 Hs KLVGTPNPSTPLPNTVPQFIAREPYELTVPALYPSSPEVWAPYPLYPAELAPALPPPAFT
IPI00161102 Hs KLMATPNPSNVHPALGAHFIARDPYDLMGAALIPASPEAWAPYPLYTTELTPAISHAAFT

cpo Dm AAALQHATLVHPALHPQVP VRSYL 

NP_006858 Hs YP ASLHAQMRWLPPSEATSQGWKSRQFC 

IPI00161102 Hs YPTATAAAAALHAQVRWYPSSDTTQQGWKYRQFC 
Figure 19B. Amino acid sequence encoded by Drosophila gene CG31243 (GadFly
Accession Number), SEQ ID NO:1

>CG31243-PA (AE003720) [gene_syn=CG31243] [prot_desc=CG31243 gene product 
from transcript CG31243-RA] 

1 LVKIANYQDL LGSHHQLLIA ATAAAAAAAA AEPQLQLQHL LPAAPTTPAV ISNPINSIGP
61 INQISSSSHP SNNNQQAVFE KAITISSIAI KRRPTLPQTP ASAPQVLSPS PKRQCAAAVS 
121 VLPVTVPVPV PVSVPLPVSV PVPVSVKGHP ISHTHQIAHT HQISHSHPIS HPHHHQLSFA
181 HPTQFAAAVA AHHQQQQQQQ AQQQQQAVQQ QQQQAVQQQQ VAYAVAASPQ LQQQQQQQQH 
241 RLAQFNQAAA AALLNQHLQQ QHQAQQQQHQ AQQQSLAHYG GYQLHRYAPQ QQQQHILLSS 
301 GSSSSKHNSN NNSNTSAGAA SAAVPIATSV AAVPTTGGSL PDSPAHESHS HESNSATASA 
361 PTTPSPAGSV TSAAPTATAT AAAAGSAAAT AAATGTPATS AVSDSNNNLN SSSSSNSNSN 
421 AIMENQMALA PLGLSQSMDS VNTASNEEEV RTLFVSGLPM DAKPRELYLL FRAYEGYEGS
481 LLKVTSKNGK TASPVGFVTF HTRAGAEAAK QDLQGVRFDP DMPQTIRLEF AKSNTKVSKP
541 KPQPNTATTA SHPALMHPLT GHLGGPFFPG GPELWHHPLA YSAAAAAELP GAAALQHATL 
601 VHPALHPQVP VRSYL 
Figure 21. Triglyceride content of a Drosophila Jafrac1 (GadFly Accession Number
CG1633) mutant
Figure 23. Homology of Drosophila Jafrac1 (GadFly Accession Number CG1633) to
human peroxiredoxin 1 and human peroxiredoxin 2 (similar to peroxiredoxin 1)

Figure 23A. BLASTP results for Jafrac1

Homology to human protein XP_009063.2 (GenBank Accession Number) 

ref|XP_009063.2| (XM_009063) peroxiredoxin 2 [Homo sapiens]
Length = 198

Score = 283 bits (723), Expect = 9e-76 

Identities = 134/188 (71%), Positives = 157/188 (83%) 

Query: 3   QLQKPAPAFAGTAVVNGVFKDIKLSDYKGKYLVLFFYPLDFTFVCPTEIIAFSESAAEFR 62
           ++ KPAP F  TAVV+G FK++KLSDYKGKY+VLFFYPLDFTFVCPTEIIAFS  A +FR
Sbjct: 7   RIGKPAPDFKATAVVDGAFKEVKLSDYKGKYVVLFFYPLDFTFVCPTEIIAFSNRAEDFR 66

Query: 63  KINCEVIGCSTDSQFTHLAWINTPRKQGGLGSMDIPLLADKSMKVARDYGVLDEETGIPF 122
           K+ CEV+G S DSQFTHLAWINTPRK+GGLG ++IPLLAD + +++ DYGVL  + GI +
Sbjct: 67  KLGCEVLGVSVDSQFTHLAWINTPRKEGGLGPLNIPLLADVTRRLSEDYGVLKTDEGIAY 126

Query: 123 RGLFIIDDKQNLRQITVNDLPVGRSVEETLRLVQAFQYTDKYGEVCPANWKPGQKTMVAD 182
           RGLFIID K  LRQITVNDLPVGRSV+E LRLVQAFQYTD++GEVCPA WKPG  T+  +
Sbjct: 127 RGLFIIDGKGVLRQITVNDLPVGRSVDEALRLVQAFQYTDEHGEVCPAGWKPGSDTIKPN 186

Query: 183 PTKSKEYF 190
              SKEYF
Sbjct: 187 VDDSKEYF 194



Homology to human protein NP_002565.1 (GenBank Accession Number)

ref|NP_002565.1| (NM_002574) peroxiredoxin 1; Proliferation-associated gene A;
proliferation-associated gene A (natural killer-enhancing factor A) [Homo sapiens]
ref|XP_001393.2| (XM_001393) peroxiredoxin 1 [Homo sapiens]
Length = 199

Score = 281 bits (718), Expect = 3e-75

Identities = 135/185 (72%), Positives = 154/185 (82%), Gaps = 1/185 (0%)

Query: 7   PAPAFAGTAVV-NGVFKDIKLSDYKGKYLVLFFYPLDFTFVCPTEIIAFSESAAEFRKIN 65
           PAP F  TAV+ +G FKDI LSDYKGKY+V FFYPLDFTFVCPTEIIAFS+ A EF+K+N
Sbjct: 11  PAPNFKATAVMPDGQFKDISLSDYKGKYVVFFFYPLDFTFVCPTEIIAFSDRAEEFKKLN 70

Query: 66  CEVIGCSTDSQFTHLAWINTPRKQGGLGSMDIPLLADKSMKVARDYGVLDEETGIPFRGL 125
           C+VIG S DS F HLAW+NTP+KQGGLG M+IPL++D    +A+DYGVL  + GI FRGL
Sbjct: 71  CQVIGASVDSHFCHLAWVNTPKKQGGLGPMNIPLVSDPKRTIAQDYGVLKADEGISFRGL 130

Query: 126 FIIDDKQNLRQITVNDLPVGRSVEETLRLVQAFQYTDKYGEVCPANWKPGQKTMVADPTK 185
           FIIDDK  LRQITVNDLPVGRSV+ETLRLVQAFQ+TDK+GEVCPA WKPG  T+  D  K
Sbjct: 131 FIIDDKGILRQITVNDLPVGRSVDETLRLVQAFQFTDKHGEVCPAGWKPGSDTIKPDVQK 190

Query: 186 SKEYF 190
           SKEYF
Sbjct: 191 SKEYF 195
Figure 23B. Multiple Sequence Alignment (ClustalW 1.83)

Jafrac1 Dm  MP----QLQKPAPAFAGTAVV-NGVFKDIKLSDYKGKYLVLFFYPLDFTFVCPTEIIAFS
PRDX1 Hs    MSSGNAKIGHPAPNFKATAVMPDGQFKDISLSDYKGKYVVFFFYPLDFTFVCPTEIIAFS
PRDX2 Hs    MASGNARIGKPAPDFKATAVV-DGAFKEVKLSDYKGKYVVLFFYPLDFTFVCPTEIIAFS

Jafrac1 Dm  ESAAEFRKINCEVIGCSTDSQFTHLAWINTPRKQGGLGSMDIPLLADKSMKVARDYGVLD
PRDX1 Hs    DRAEEFKKLNCQVIGASVDSHFCHLAWVNTPKKQGGLGPMNIPLVSDPKRTIAQDYGVLK
PRDX2 Hs    NRAEDFRKLGCEVLGVSVDSQFTHLAWINTPRKEGGLGPLNIPLLADVTRRLSEDYGVLK

Jafrac1 Dm  EETGIPFRGLFIIDDKQNLRQITVNDLPVGRSVEETLRLVQAFQYTDKYGEVCPANWKPG
PRDX1 Hs    ADEGISFRGLFIIDDKGILRQITVNDLPVGRSVDETLRLVQAFQFTDKHGEVCPAGWKPG
PRDX2 Hs    TDEGIAYRGLFIIDGKGVLRQITVNDLPVGRSVDEALRLVQAFQYTDEHGEVCPAGWKPG

Jafrac1 Dm  QKTMVADPTKSKEYFETTS
PRDX1 Hs    SDTIKPDVQKSKEYFSKQK
PRDX2 Hs    SDTIKPNVDDSKEYFSKHN
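
As a quick check on how similar two rows of the alignment above are, identical columns can be counted directly. The sketch below is illustrative only: `identical_columns` is a hypothetical helper, not a ClustalW function, and it is applied here to the final PRDX1 and PRDX2 rows as printed above.

```python
def identical_columns(row_a, row_b, gap="-"):
    """Count identical residues between two equal-length aligned rows.

    Returns (identical, compared). Columns where both rows hold a gap
    are skipped; a residue facing a gap counts as compared but not identical.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


prdx1_tail = "SDTIKPDVQKSKEYFSKQK"  # last PRDX1 row of the alignment
prdx2_tail = "SDTIKPNVDDSKEYFSKHN"  # last PRDX2 row of the alignment
same, total = identical_columns(prdx1_tail, prdx2_tail)
print(f"{same}/{total} identical columns")  # prints "14/19 identical columns"
```

Applied block by block and summed, the same count gives a pairwise identity over the whole alignment, provided the rows have been restored to equal length first.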
Figure 25. Triglyceride content of a Drosophila CG14440 (GadFly Accession Number)
mutant
Figure 27. BLASTP results for CG14440 (GadFly Accession Number)
Homology to human protein NP_060000.1 (GenBank Accession Number)

ref|NP_060000.1| (NM_017530) hypothetical protein LOC55565 [Homo sapiens]
Length = 370

Score = 77.4 bits (189), Expect = 2e-13 

Identities = 41/106 (38%), Positives = 62/106 (57%) 

Query: 195 QGQSSRAQKAARRRSNESIEARERRLERNAARMRDKRAKESEAEYRVRL^ 254 

+ Q+ +K A RR NE +E R +RLER + +R E+ E VR ++ EA R++R 

Sbjct: 207 EAQTPSVRKWADRRQNEPLEVRLQRLERERTAKKSRRDNETPEEREVRRMRDREAKRLQR 266

Query: 255 QNETEVQRTLRLMKNAARQRLRRASETVEERKKRLAKAAERMRIAR 300

ET+ QR RL ++ RL+RA+ET E+R+ RL + E R+ R 
Sbjct: 267 MQETDEQRARRLQRDREAMRLKRANETPEKRQARIilREREAKRLKR 312 




(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date
13 November 2003 (13.11.2003)

(10) International Publication Number
WO 2003/092715 A3



(51) International Patent Classification 7: A61K 38/17,
48/00, C12Q 1/68, A01K 67/027, C12N 5/10, A61P 3/00

(21) International Application Number: PCT/EP2003/004650

(22) International Filing Date: 2 May 2003 (02.05.2003)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
02009883.6   2 May 2002 (02.05.2002)    EP
02010332.1   7 May 2002 (07.05.2002)    EP
02010948.1   16 May 2002 (16.05.2002)   EP



(71) Applicant (for all designated States except US):
DEVELOGEN AKTIENGESELLSCHAFT FÜR
ENTWICKLUNGSBIOLOGISCHE FORSCHUNG [DE/DE];
Rudolf-Wissell-Strasse 28, 37079 Göttingen (DE).

(72) Inventors; and

(75) Inventors/Applicants (for US only): EULENBERG,
Karsten [DE/DE]; Vom-Stein-Strasse 29, 37120 Bovenden (DE).
STEUERNAGEL, Arnd [DE/DE]; Am Kirschberge 4,
37085 Göttingen (DE). HADER, Thomas [DE/DE];
Wiesenstrasse 17, 37073 Göttingen (DE). MEISE, Martin
[DE/DE]; An der Tranke 10, 37079 Göttingen (DE).
BRONNER, Günter [DE/DE]; Springstrasse 54,
37077 Göttingen (DE).

(74) Agent: WEICKMANN & WEICKMANN; Postfach 86
08 20, 81635 München (DE).



(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, 
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, 
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NI, NO, NZ, OM, PH, PL, PT, RO, RU, SC, SD, 
SE, SG, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, 
UZ, VC, VN, YU, ZA, ZM, ZW. 

(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE, 
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO, 
SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, 
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG). 

Declaration under Rule 4.17: 

- of inventorship (Rule 4.17(iv)) for US only 

Published: 

- with international search report 

- before the expiration of the time limit for amending the 
claims and to be republished in the event of receipt of 
amendments 

(88) Date of publication of the international search report: 

15 July 2004 

For two-letter codes and other abbreviations, refer to the "Guidance 
Notes on Codes and Abbreviations" appearing at the beginning 
of each regular issue of the PCT Gazette. 



(54) Title: PROTEINS INVOLVED IN THE REGULATION OF ENERGY HOMEOSTASIS 



(57) Abstract: The present invention discloses novel uses for energy homeostasis regulating proteins and polynucleotides encoding 
these in the diagnosis, study, prevention, and treatment of metabolic diseases and disorders. 



INTERNATIONAL SEARCH REPORT 

International Application No 
PCT/EP 03/04650 

IPC 7 A61K38/17 A61K48/00 C12Q1/68 A01K67/027 C12N5/10 
A61P3/08 

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC 7 A61K 

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 

Electronic data base consulted during the international search (name of data base and, where practical, search terms used) 

EPO-Internal, WPI Data, PAJ, BIOSIS, EMBASE, CHEM ABS Data 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 

Category: Y 
Citation: WO 97/82048 A (MILLENNIUM PHARM INC) 
23 January 1997 (1997-01-23) 
claims 1,12,17,22 
Relevant to claim No.: 1-15, 18-29, 31 

Category: Y 
Citation: WO 97/19952 A (TARTAGLIA LOUIS A; WHITE 
DAVID W (US); TEPPER ROBERT I (US); CULPE) 
5 June 1997 (1997-06-05) 
abstract; claims 14,27,28,37,38,50,62 
Relevant to claim No.: 1-15, 18-29, 31 

Category: Y 
Citation: FLEURY C ET AL: "UNCOUPLING PROTEIN-2: A 
NOVEL GENE LINKED TO OBESITY AND 
HYPERINSULINEMIA" 
NATURE GENETICS, NEW YORK, NY, US, 
vol. 15, no. 3, 1 March 1997 (1997-03-01), 
pages 269-272, XP002064499 
ISSN: 1061-4036 
abstract 
page 271, column 2 
Relevant to claim No.: 1-15, 18-29, 31 


[X] Further documents are listed in the continuation of box C. 

[X] Patent family members are listed in annex. 



* Special categories of cited documents: 

"A" document defining the general state of the art which is not 
considered to be of particular relevance 

"E" earlier document but published on or after the international 
filing date 

"L" document which may throw doubts on priority claim(s) or 
which is cited to establish the publication date of another 
citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or 
other means 

"P" document published prior to the international filing date but 
later than the priority date claimed 

"T" later document published after the international filing date 
or priority date and not in conflict with the application but 
cited to understand the principle or theory underlying the 
invention 

"X" document of particular relevance; the claimed invention 
cannot be considered novel or cannot be considered to 
involve an inventive step when the document is taken alone 

"Y" document of particular relevance; the claimed invention 
cannot be considered to involve an inventive step when the 
document is combined with one or more other such documents, 
such combination being obvious to a person skilled 
in the art. 

"&" document member of the same patent family 



Date of the actual completion of the international search 

1 December 2003 

Date of mailing of the international search report 

21 04 2004 

Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2 
NL-2280 HV Rijswijk 
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl, 
Fax: (+31-70) 340-3016 

Authorized officer 

Gonzalez Ramon, N 



Form PCT/ISA/210 (second sheet) (January 2004) 






C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT 

Citation: FORTINI M. E. ET AL: "A survey of human 
disease gene counterparts in the 
drosophila genome" 
J CELL BIOL, 
vol. 150, no. 2, 
24 July 2000 (2000-07-24), pages F23-F29, 
XP062257035 
abstract; table 1 
page F26, paragraph 3 
page F29, paragraph 3 
Relevant to claim No.: 1-15, 18-29, 31 

Citation: CHIESI M ET AL: "PHARMACOTHERAPY OF 
OBESITY: TARGETS AND PERSPECTIVES" 
TRENDS IN PHARMACOLOGICAL SCIENCES, 
ELSEVIER TRENDS JOURNAL, CAMBRIDGE, GB, 
vol. 22, no. 5, May 2001 (2001-05), pages 
247-254, XP001080G52 
ISSN: 0165-6147 
page 251, column 2; table 1 
Relevant to claim No.: 1-15, 18-29, 31 

Citation: "CG7956" 
FLYBASE.BIO.INDIANA.EDU, [Online] page 1, 
XP002256412 
Retrieved from the Internet: 
URL: http://flybase.bio.indiana.edu/.bin/fbidq.html?FBgn0038890&resultlist=fbgn9869.data[0]> [retrieved on 2003-10-02] 
abstract 
Relevant to claim No.: 1-15, 18-29, 31 






Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet) 

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons: 

1. Claims Nos.: 
because they relate to subject matter not required to be searched by this Authority, namely: 

Although claim 20 is directed to a diagnostic method practised on the 
human/animal body, the search has been carried out and based on the alleged 
effects of the compound/composition. 

2. [X] Claims Nos.: partially 1-31 
because they relate to parts of the International Application that do not comply with the prescribed requirements to such 
an extent that no meaningful International Search can be carried out, specifically: 

see FURTHER INFORMATION sheet PCT/ISA/210 

3. [ ] Claims Nos.: 
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 

This International Searching Authority found multiple inventions in this international application, as follows: 

see additional sheet 

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all 
searchable claims. 

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report 
covers only those claims for which fees were paid, specifically claims Nos.: 

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 

1-31 (partially) 

Remark on Protest 
[ ] The additional search fees were accompanied by the applicant's protest. 
[ ] No protest accompanied the payment of additional search fees. 






FURTHER INFORMATION CONTINUED FROM PCT/ISA/210 

Continuation of Box I.1 

Although claim 28 is directed to a diagnostic method practised on the 
human/animal body, the search has been carried out and based on the 
alleged effects of the compound/composition. 

Continuation of Box I.1 
Claims Nos.: 16, 17, 30 

Rule 39.1(ii) PCT - Animal variety 

Rule 39.1(ii) PCT - Essentially biological process for the production 
of animals 

Continuation of Box I.2 
Claims Nos.: partially 1-31 




polypeptide encoded thereby 







Chapter II procedure. If the application proceeds into the regional phase 
before the EPO, the applicant is reminded that a search may be carried 
out during examination before the EPO (see EPO Guideline C-VI, 8.5), 
should the problems which led to the Article 17(2) declaration be 
overcome. 




This International Searching Authority found multiple (groups of) 
inventions in this international application, as follows: 
Pharmaceutical composition comprising a CG7956 nucleic acid 
molecule or a polypeptide encoded thereby. Use of the same 
for the treatment of obesity, diabetes and/or metabolic 
syndrome. Methods of screening using the same. 

Pharmaceutical composition comprising an aralar1 nucleic 
acid molecule or a polypeptide encoded thereby. Use of the 
same for the treatment of obesity, diabetes and/or metabolic 
syndrome. Methods of screening using the same. 

Pharmaceutical composition comprising a Jafrac1 nucleic 
acid molecule or a polypeptide encoded thereby. Use of the 
same for the treatment of obesity, diabetes and/or metabolic 
syndrome. Methods of screening using the same. 

Pharmaceutical composition comprising a CG14440 nucleic acid 
molecule or a polypeptide encoded thereby. Use of the same 
for the treatment of obesity, diabetes and/or metabolic 
syndrome. Methods of screening using the same. 